Optimizing fully-efficient two-stage models for genomic selection using open-source software

Abstract Genomic-assisted breeding has transitioned from theoretical concepts to practical applications in breeding. Genomic selection (GS) predicts genomic breeding values (GEBV) using dense genetic markers. Single-stage models predict GEBVs from phenotypic observations in one step, fully accountin...

Bibliographic Details
Main Authors:	Javier Fernández-González, Julio Isidro y Sánchez
Format:	Article
Language:	English
Published:	BMC 2025-02-01
Series:	Plant Methods
Subjects:	Two-stage models, Fully-efficient, Variance-covariance, Open-source, Weighted regression, Genomic prediction
Online Access:	https://doi.org/10.1186/s13007-024-01318-9

author	Javier Fernández-González, Julio Isidro y Sánchez
collection	DOAJ
description	Abstract Genomic-assisted breeding has transitioned from theoretical concepts to practical applications in breeding. Genomic selection (GS) predicts genomic breeding values (GEBV) using dense genetic markers. Single-stage models predict GEBVs from phenotypic observations in one step, fully accounting for the entire variance-covariance structure among genotypes, but face computational challenges. Two-stage models, preferred for their simplicity and efficiency, first calculate adjusted genotypic means accounting for spatial variation within each environment, then use these means to predict GEBVs. However, unweighted (UNW) two-stage models assume independent errors among adjusted means, neglecting correlations among estimation errors. Here, we show that fully-efficient two-stage models perform similarly to UNW models for randomized complete block designs but substantially better for augmented designs. Our simulation studies demonstrate the impact of the fully-efficient methodology on prediction accuracy across different implementations and scenarios. Incorporating non-additive effects and augmented designs significantly improved accuracy, emphasizing the synergy between design and model strategy. Consistent performance requires the estimation error covariance to be incorporated into a random effect (Full_R model) rather than into the residuals. Our results suggest that the fully-efficient methodology, particularly the Full_R model, should be more prevalent, especially as GS increases the appeal of sparse designs. We also provide a comprehensive theoretical background and open-source R code, enhancing understanding and facilitating broader adoption of fully-efficient two-stage models in GS. Here, we offer insights into the practical applications of fully-efficient models and their potential to increase genetic gain, demonstrating a 13.80% improvement after five selection cycles when moving from UNW to Full_R models.
format	Article
id	doaj-art-aec0522c9e854ab5b8f700d5b2ec7046
institution	Kabale University
issn	1746-4811
language	English
publishDate	2025-02-01
publisher	BMC
record_format	Article
series	Plant Methods
spelling	doaj-art-aec0522c9e854ab5b8f700d5b2ec7046 | 2025-02-09T12:38:44Z | eng | BMC | Plant Methods | 1746-4811 | 2025-02-01 | 211124 | 10.1186/s13007-024-01318-9 | Optimizing fully-efficient two-stage models for genomic selection using open-source software | Javier Fernández-González 0 | Julio Isidro y Sánchez 1 | Centro de Biotecnologia y Genómica de Plantas (CBGP, UPM-INIA) - Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnologia Agraria y Alimentaria (INIA) | Centro de Biotecnologia y Genómica de Plantas (CBGP, UPM-INIA) - Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnologia Agraria y Alimentaria (INIA) | https://doi.org/10.1186/s13007-024-01318-9 | Two-stage models | Fully-efficient | Variance-covariance | Open-source | Weighted regression | Genomic prediction
title	Optimizing fully-efficient two-stage models for genomic selection using open-source software
topic	Two-stage models, Fully-efficient, Variance-covariance, Open-source, Weighted regression, Genomic prediction
url	https://doi.org/10.1186/s13007-024-01318-9


Similar Items