A Multiscale, Multiphysics Modeling Framework For Genome-To-Phenome Mapping Via Intermediate Phenotypes

Grants and Contracts Details


The relationship between a genome and most phenotypic traits (e.g., plant yield, lodging resistance) is governed by complex functional relationships and nonlinear interactions that exist at multiple temporal and spatial scales. Any generalizable framework for generating mechanistic understanding of the mapping between genomic data and phenotypic traits must therefore be able to account for such complexities and interactions. While currently available state-of-the-art genomics approaches such as genome-wide association (GWA), gene co-expression network (GCN), and expression quantitative trait locus (eQTL) analyses have been successful in linking specific genes, variants, and transcripts to specific phenotypes, they tend to either underestimate or overestimate genome-to-phenome interactions. Thus, a more advanced and flexible mathematical and statistical framework capable of capturing complex multiscale, nonlinear interactions and functional relationships is required to provide a complete picture of genome-phenome relationships. We propose to develop a multiscale, multiphysics modeling framework that is flexible enough to leverage prior knowledge from existing studies, expert knowledge, and biological relevance while being advanced enough to account for sophisticated nonlinear interactions. To accomplish this, the modeling framework will use advanced statistical methods, finite element methods, and intermediate-phenotype data that are typically excluded from genome-to-phenome mapping efforts. In developing this modeling framework, we will focus on a problem that has faced agriculture for decades and limits food production globally, namely stalk lodging (breaking or snapping of the plant stem prior to harvest) and resistance to it. Development of the modeling framework will consist of three primary phases.
The first phase will relate intermediate phenotypes to complex traits of interest via a statistical model that includes structural components (i.e., partial differential equations predicated upon first principles and structural engineering theory). The second phase involves training, evaluating, and validating predictive models of intermediate phenotypes using genetic and environmental data. The final phase will capitalize on the principles of Bayesian statistics to build a modeling network that ties the previous phases of model development together into a single unified framework, which can be used to relate complex traits to genomic data. By using missing-data techniques, the final model can be updated and refined based solely on the complex trait, environmental data, and genomic data, thus obviating the need to collect labor-intensive intermediate-phenotype data.
Effective start/end date: 8/15/18 to 7/31/24


  • National Science Foundation: $5,999,995.00

