Grants and Contracts Details
Description
Proteins are key molecules that play an essential role in many biological and engineering
processes. Many proteins must fold into specific 3D structures to perform their functions. Recently
developed deep learning tools such as AlphaFold2 can accurately predict proteins' plausible
folded 3D structures. Nevertheless, understanding the dynamics of proteins is as important as
obtaining a folded structure. Many crucial changes in proteins happen over extended periods, ranging
from milliseconds to seconds or even longer, beyond the timescales that typical molecular dynamics (MD)
simulations can reach. Key examples include large conformational changes, protein-ligand
interactions, allosteric regulation, protein-protein interactions, molecular motors, signal transduction,
and protein aggregation. These processes are essential for protein function and are involved in various
cellular activities and diseases. Methods that enable long-stride MD simulations are needed to
support research on these problems.
The recent development of consistency models sparks our research idea. The consistency model was
developed as an alternative to the diffusion model. The principle of the diffusion model is similar to
that of MD simulations: the system moves a small step every time. Thus, a well-trained diffusion model
should be able to predict the slight variation of protein 3D structure in each MD simulation step. Unlike
the diffusion model, the consistency model can predict the variation of the system at any time relative to
the initial conditions. In other words, a well-trained consistency model should be able to predict the
protein 3D structure at any time after the initial condition.
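As a toy analogy only (not a consistency model, and not the proposed method), the contrast between many small steps and a single long jump can be sketched with a 1D harmonic oscillator, whose exact long-time map is known in closed form. The function names, parameters, and reduced units below are hypothetical and chosen purely for illustration; in the proposed setting the one-jump map would have to be learned rather than written down.

```python
import math

def velocity_verlet(x, v, k, m, dt, n_steps):
    """Advance a 1D harmonic oscillator with many small steps, as an MD integrator does."""
    a = -k * x / m
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt
        a_new = -k * x / m
        v = v + 0.5 * (a + a_new) * dt
        a = a_new
    return x, v

def one_jump(x0, v0, k, m, t):
    """Map the initial condition directly to time t using the closed-form solution.

    This stands in for the kind of long-stride map a trained model would
    have to approximate; here it is exact only because the system is trivial.
    """
    w = math.sqrt(k / m)
    x = x0 * math.cos(w * t) + (v0 / w) * math.sin(w * t)
    v = -x0 * w * math.sin(w * t) + v0 * math.cos(w * t)
    return x, v

# Illustrative values in reduced units (assumed, not from the proposal).
x_steps, v_steps = velocity_verlet(1.0, 0.0, k=1.0, m=1.0, dt=0.001, n_steps=5000)
x_jump, v_jump = one_jump(1.0, 0.0, k=1.0, m=1.0, t=5.0)
print(abs(x_steps - x_jump), abs(v_steps - v_jump))
```

Both routes reach nearly the same state at the final time, but the stepwise route needs 5000 force evaluations while the direct map needs one evaluation.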
We propose to leverage the ability of the consistency model to develop a deep learning-based
long-stride MD simulation engine. A consistency model provides a new approach for generating images in
one step, as opposed to the iterative process used in diffusion models. Compared to diffusion models,
consistency models can generate images significantly faster. In our case, we can treat the matrix of
distances among a protein's backbone atoms as an image and apply this method to it. The research will be
conducted based on two hypotheses: (a) a well-trained diffusion model can predict the step-by-step
slight change of protein 3D structure in one short MD simulation, and (b) we can develop the
consistency model to predict the variation in protein 3D structure over a long time step. Driven by the
two hypotheses, the proposed research will include two tasks: (a) develop diffusion models to predict
variation of protein 3D structure with small time intervals, and (b) develop consistency models to
predict variation of protein 3D structure with large time intervals.
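To make the "distance matrix as an image" idea concrete, the following minimal sketch builds a symmetric pairwise distance matrix from a list of backbone atom coordinates. It assumes one atom per residue (for example, C-alpha) and uses made-up coordinates in angstroms; the function name and the choice of atoms are illustrative assumptions, not the representation the project will necessarily adopt.

```python
import math

def distance_matrix(coords):
    """Return the symmetric matrix of pairwise Euclidean distances."""
    n = len(coords)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist = math.dist(coords[i], coords[j])
            d[i][j] = dist
            d[j][i] = dist  # the matrix is symmetric, like a grayscale image mirrored on its diagonal
    return d

# Hypothetical coordinates for three backbone atoms (angstroms).
ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
dm = distance_matrix(ca)
for row in dm:
    print(["%.2f" % value for value in row])
```

Each entry plays the role of a pixel intensity, so an N-residue chain yields an N by N single-channel "image" that does not change when the whole protein is rotated or translated.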
The proposed research will be conducted by the Shao group at the University of Kentucky and the Xu
group at the University of Missouri. The Shao group has expertise in protein modeling and the Xu
group has expertise in deep learning. The two groups have existing collaborations on protein language
models. We will use model protein systems, such as polyalanine chains, polyglycine chains, and small
proteins such as Trp-cage, to illustrate the performance of the developed models. The proposed
research will also explore suitable encoders and representations for protein 3D structures and suitable
architectures for diffusion and consistency models, and will generate preliminary data for future
proposals.
The expected outcomes include (a) consistency models that can predict the variation in protein 3D
structure over a long time step (such as hundreds of fs), and (b) a framework to predict protein folding
pathways and other long-timescale events based on the development and deployment of the consistency model.
| Status | Active |
|---|---|
| Effective start/end date | 8/1/25 → 7/31/27 |
Funding
- National Science Foundation: $250,000.00