SPAGRM: effectively controlling for sample relatedness in large-scale genome-wide association studies of longitudinal traits

Bibliographic Details
Main Authors: He Xu, Yuzhuo Ma, Lin-lin Xu, Yin Li, Yufei Liu, Ying Li, Xu-jie Zhou, Wei Zhou, Seunggeun Lee, Peipei Zhang, Weihua Yue, Wenjian Bi
Format: Article
Language: English
Published: Nature Portfolio 2025-02-01
Series: Nature Communications
Online Access: https://doi.org/10.1038/s41467-025-56669-1
collection DOAJ
description Abstract Sample relatedness is a major confounder in genome-wide association studies (GWAS), potentially leading to inflated type I error rates if not appropriately controlled. A common strategy is to incorporate a random effect related to the genetic relatedness matrix (GRM) into regression models. However, this approach is challenging for large-scale GWAS of complex traits, such as longitudinal traits. Here we propose a scalable and accurate analysis framework, SPAGRM, which controls for sample relatedness via a precise approximation of the joint distribution of genotypes. SPAGRM can utilize GRM-free models and is thus applicable to various trait types and statistical methods, including linear mixed models and generalized estimating equations for longitudinal traits. A hybrid strategy incorporating saddlepoint approximation greatly increases accuracy when analyzing low-frequency and rare genetic variants, especially with unbalanced phenotypic distributions. We also introduce SPAGRM(CCT) to aggregate the results from different models via the Cauchy combination test. Extensive simulations and real data analyses demonstrated that SPAGRM maintains well-controlled type I error rates and that SPAGRM(CCT) can serve as a broadly effective method. Applying SPAGRM to 79 longitudinal traits extracted from UK Biobank primary care data, we identified 7,463 genetic loci, making a pioneering attempt to conduct GWAS for these traits as longitudinal traits.
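The Cauchy combination test named in the abstract merges p-values obtained from several models into a single p-value. The following is a minimal Python sketch of the generic test, not of the SPAGRM software itself: the function name `cauchy_combination`, the equal default weights, and the numerical cutoffs for very small p-values are illustrative assumptions.

```python
import math


def cauchy_combination(p_values, weights=None):
    """Combine p-values with the Cauchy combination test (generic sketch).

    Each p-value is mapped to a standard Cauchy variate tan((0.5 - p) * pi);
    the weighted sum is converted back to a p-value with the Cauchy tail.
    Names and cutoffs are hypothetical, chosen only for illustration.
    """
    p_values = list(p_values)
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p < 1.0) for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")

    if weights is None:
        weights = [1.0] * len(p_values)  # assumption: equal weights
    weights = list(weights)
    if len(weights) != len(p_values) or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative, one per p-value")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must not all be zero")

    stat = 0.0
    for p, w in zip(p_values, weights):
        w = w / total  # normalise so the weights sum to 1
        if p < 1e-15:
            # tan((0.5 - p) * pi) is approximately 1 / (p * pi) for tiny p
            stat += w / (p * math.pi)
        else:
            stat += w * math.tan((0.5 - p) * math.pi)

    if stat > 1e15:
        # Tail approximation avoids loss of precision in atan
        return 1.0 / (stat * math.pi)
    return 0.5 - math.atan(stat) / math.pi
```

With a single p-value the function returns that p-value unchanged, which is a quick sanity check; for example, `cauchy_combination([0.01, 0.5, 0.8])` combines three results from hypothetical models into one value.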
id doaj-art-ea6b5701bb72419cb24152fb31a58159
institution Kabale University
issn 2041-1723
Author affiliations:
He Xu: Department of Medical Genetics, School of Basic Medical Sciences, Peking University
Yuzhuo Ma: Department of Medical Genetics, School of Basic Medical Sciences, Peking University
Lin-lin Xu: Renal Division, Peking University First Hospital; Peking University Institute of Nephrology
Yin Li: Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Peking University Health Science Center
Yufei Liu: Department of Medical Genetics, School of Basic Medical Sciences, Peking University
Ying Li: Department of Medical Genetics, School of Basic Medical Sciences, Peking University
Xu-jie Zhou: Renal Division, Peking University First Hospital; Peking University Institute of Nephrology
Wei Zhou: Center for Genomic Medicine, Massachusetts General Hospital
Seunggeun Lee: Graduate School of Data Science, Seoul National University
Peipei Zhang: Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Peking University Health Science Center
Weihua Yue: Peking University Sixth Hospital, Peking University Institute of Mental Health, NHC Key Laboratory of Mental Health (Peking University), National Clinical Research Center for Mental Disorders (Peking University Sixth Hospital)
Wenjian Bi: Department of Medical Genetics, School of Basic Medical Sciences, Peking University