SPAGRM: effectively controlling for sample relatedness in large-scale genome-wide association studies of longitudinal traits
Abstract: Sample relatedness is a major confounder in genome-wide association studies (GWAS), potentially leading to inflated type I error rates if not appropriately controlled. A common strategy is to incorporate a random effect related to the genetic relatedness matrix (GRM) into regression models. How...
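Main Authors: | He Xu, Yuzhuo Ma, Lin-lin Xu, Yin Li, Yufei Liu, Ying Li, Xu-jie Zhou, Wei Zhou, Seunggeun Lee, Peipei Zhang, Weihua Yue, Wenjian Bi |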
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2025-02-01 |
Series: | Nature Communications |
Online Access: | https://doi.org/10.1038/s41467-025-56669-1 |
_version_ | 1823861853790404608 |
---|---|
author | He Xu, Yuzhuo Ma, Lin-lin Xu, Yin Li, Yufei Liu, Ying Li, Xu-jie Zhou, Wei Zhou, Seunggeun Lee, Peipei Zhang, Weihua Yue, Wenjian Bi |
collection | DOAJ |
description | Abstract: Sample relatedness is a major confounder in genome-wide association studies (GWAS), potentially leading to inflated type I error rates if not appropriately controlled. A common strategy is to incorporate a random effect related to the genetic relatedness matrix (GRM) into regression models. However, this approach is challenging for large-scale GWAS of complex traits, such as longitudinal traits. Here we propose a scalable and accurate analysis framework, SPAGRM, which controls for sample relatedness via a precise approximation of the joint distribution of genotypes. SPAGRM can utilize GRM-free models and thus is applicable to various trait types and statistical methods, including linear mixed models and generalized estimating equations for longitudinal traits. A hybrid strategy incorporating saddlepoint approximation greatly increases accuracy when analyzing low-frequency and rare genetic variants, especially with unbalanced phenotypic distributions. We also introduce SPAGRM(CCT) to aggregate the results from different models via the Cauchy combination test. Extensive simulations and real data analyses demonstrated that SPAGRM maintains well-controlled type I error rates and that SPAGRM(CCT) can serve as a broadly effective method. Applying SPAGRM to 79 longitudinal traits extracted from UK Biobank primary care data, we identified 7,463 genetic loci, making a pioneering attempt to conduct GWAS for these traits as longitudinal traits. |
format | Article |
id | doaj-art-ea6b5701bb72419cb24152fb31a58159 |
institution | Kabale University |
issn | 2041-1723 |
language | English |
publishDate | 2025-02-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Nature Communications |
spelling | doaj-art-ea6b5701bb72419cb24152fb31a58159; 2025-02-09T12:44:34Z; eng; Nature Portfolio; Nature Communications; 2041-1723; 2025-02-01; 16; 1; 1; 19; 10.1038/s41467-025-56669-1; SPAGRM: effectively controlling for sample relatedness in large-scale genome-wide association studies of longitudinal traits; He Xu [Department of Medical Genetics, School of Basic Medical Sciences, Peking University]; Yuzhuo Ma [Department of Medical Genetics, School of Basic Medical Sciences, Peking University]; Lin-lin Xu [Renal Division, Peking University First Hospital; Peking University Institute of Nephrology]; Yin Li [Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Peking University Health Science Center]; Yufei Liu [Department of Medical Genetics, School of Basic Medical Sciences, Peking University]; Ying Li [Department of Medical Genetics, School of Basic Medical Sciences, Peking University]; Xu-jie Zhou [Renal Division, Peking University First Hospital; Peking University Institute of Nephrology]; Wei Zhou [Center for Genomic Medicine, Massachusetts General Hospital]; Seunggeun Lee [Graduate School of Data Science, Seoul National University]; Peipei Zhang [Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Peking University Health Science Center]; Weihua Yue [Peking University Sixth Hospital, Peking University Institute of Mental Health, NHC Key Laboratory of Mental Health (Peking University), National Clinical Research Center for Mental Disorders (Peking University Sixth Hospital)]; Wenjian Bi [Department of Medical Genetics, School of Basic Medical Sciences, Peking University]; https://doi.org/10.1038/s41467-025-56669-1 |
title | SPAGRM: effectively controlling for sample relatedness in large-scale genome-wide association studies of longitudinal traits |
url | https://doi.org/10.1038/s41467-025-56669-1 |
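
The abstract in this record notes that SPAGRM(CCT) aggregates results from different models via the Cauchy combination test. As a rough illustration of that test in general (not the authors' implementation), the sketch below combines a list of p-values with the standard Cauchy combination formula. The function name `cauchy_combination` and the default of equal weights are assumptions made for this example; the record does not say which weights SPAGRM(CCT) uses.

```python
import math


def cauchy_combination(pvalues, weights=None):
    """Combine p-values with the Cauchy combination test.

    Hypothetical helper for illustration only. Equal weights are assumed
    when none are given; weights are rescaled to sum to 1.
    """
    if not pvalues:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p < 1.0) for p in pvalues):
        raise ValueError("p-values must lie strictly between 0 and 1")
    if weights is None:
        weights = [1.0] * len(pvalues)
    if len(weights) != len(pvalues) or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative, one per p-value")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must not all be zero")

    # Each p-value is mapped to a standard Cauchy variate, then averaged.
    statistic = sum(
        (w / total) * math.tan((0.5 - p) * math.pi)
        for w, p in zip(weights, pvalues)
    )
    # Upper-tail probability of the standard Cauchy distribution.
    return 0.5 - math.atan(statistic) / math.pi


if __name__ == "__main__":
    # Example: p-values for one variant from three hypothetical models.
    print(cauchy_combination([0.01, 0.20, 0.03]))
```

A single p-value is returned unchanged by this formula, and very small p-values dominate the combined result, which is why the test is often used to pool evidence across several models for the same variant.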