TY - JOUR
T1 - Enhancing prediction accuracy of grain yield in wheat lines adapted to the southeastern United States through multivariate and multi-environment genomic prediction models incorporating spectral and thermal information
AU - McBreen, Jordan
AU - Babar, Md Ali
AU - Jarquin, Diego
AU - Khan, Naeem
AU - Harrison, Steve
AU - DeWitt, Noah
AU - Mergoum, Mohamed
AU - Lopez, Ben
AU - Boyles, Richard
AU - Lyerly, Jeanette
AU - Murphy, J. Paul
AU - Shakiba, Ehsan
AU - Sutton, Russel
AU - Ibrahim, Amir
AU - Howell, Kimberly
AU - Smith, Jared H.
AU - Brown-Guedira, Gina
AU - Tiwari, Vijay
AU - Santantonio, Nicholas
AU - Van Sanford, David A.
N1 - Publisher Copyright: © 2024 The Author(s). The Plant Genome published by Wiley Periodicals LLC on behalf of Crop Science Society of America.
PY - 2025/3
Y1 - 2025/3
AB - Enhancing predictive modeling accuracy in wheat (Triticum aestivum) breeding through the integration of high-throughput phenotyping (HTP) data with genomic information is crucial for maximizing genetic gain. In this study, spanning four locations in the southeastern United States over 3 years, models to predict grain yield (GY) were investigated through different cross-validation approaches. The results demonstrate the superiority of multivariate comprehensive models that incorporate both genomic and HTP data, particularly in accurately predicting GY across diverse locations and years. These HTP-incorporating models achieve prediction accuracies ranging from 0.59 to 0.68, compared to 0.40–0.54 for genomic-only models when tested under different prediction scenarios both across years and locations. The comprehensive models exhibit superior generalization to new environments and achieve the highest accuracy when trained on diverse datasets. Predictive accuracy improves as models incorporate data from multiple years, highlighting the importance of considering temporal dynamics in modeling approaches. The study reveals that multivariate prediction outperformed genomic prediction methods in predicting lines across years and locations. The percentage of top 25% lines selected based on multivariate prediction was higher compared to genomic-only models, indicated by higher specificity, which is the proportion of correctly identified top-yielding lines that matched the observed top 25% performance across different sites and years. Additionally, the study addresses the prediction of untested locations based on other locations within the same year and in new years at previously tested locations. Findings show the comprehensive models effectively extrapolate to new environments, highlighting their potential for guiding breeding strategies.
UR - http://www.scopus.com/inward/record.url?scp=85209792454&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85209792454&partnerID=8YFLogxK
U2 - 10.1002/tpg2.20532
DO - 10.1002/tpg2.20532
M3 - Article
AN - SCOPUS:85209792454
SN - 1940-3372
VL - 18
JO - Plant Genome
JF - Plant Genome
IS - 1
M1 - e20532
ER -