Abstract
Accurate prediction of ligand-receptor binding affinity is crucial in structure-based drug design, significantly impacting the development of effective drugs. Recent advances in machine learning (ML)-based scoring functions have improved these predictions, yet challenges remain in modeling complex molecular interactions. This study introduces the AGL-EAT-Score, a scoring function that integrates extended atom-type multiscale weighted colored subgraphs with algebraic graph theory. This approach leverages the eigenvalues and eigenvectors of graph Laplacian and adjacency matrices to capture high-level details of specific atom pairwise interactions. Evaluated against benchmark datasets such as CASF-2016, CASF-2013, and the Cathepsin S dataset, the AGL-EAT-Score demonstrates notable accuracy, outperforming existing traditional and ML-based methods. The model’s strength lies in its comprehensive similarity analysis, examining protein sequence, ligand structure, and binding site similarities, thus ensuring minimal bias and over-representation in the training sets. The use of extended atom types in graph coloring enhances the model’s capability to capture the intricacies of protein-ligand interactions. The AGL-EAT-Score marks a significant advancement in drug design, offering a tool that could potentially refine and accelerate the drug discovery process.
Scientific Contribution
The AGL-EAT-Score presents an algebraic graph-based framework that predicts ligand-receptor binding affinity by constructing multiscale weighted colored subgraphs from the 3D structure of protein-ligand complexes. It improves prediction accuracy by modeling interactions between extended atom types, addressing challenges like dataset bias and over-representation. Benchmark evaluations demonstrate that AGL-EAT-Score outperforms existing methods, offering a robust and systematic tool for structure-based drug design.
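To make the idea of a weighted colored subgraph more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that one subgraph is built per pair of atom types (protein atoms of one type, ligand atoms of another), that edges connect only protein atoms to ligand atoms, and that edge weights come from a generalized exponential kernel of the interatomic distance. The function names, the kernel parameters `eta` and `kappa`, the `cutoff` value, and the choice of summary statistics are all illustrative assumptions.

```python
import numpy as np


def subgraph_matrices(protein_xyz, ligand_xyz, eta=5.0, kappa=2.0, cutoff=12.0):
    """Return (laplacian, adjacency) for one protein-ligand atom-type pair.

    Hypothetical helper: the kernel form and default parameters are
    assumptions made for illustration only.
    """
    protein_xyz = np.asarray(protein_xyz, dtype=float)
    ligand_xyz = np.asarray(ligand_xyz, dtype=float)
    n_p, n_l = len(protein_xyz), len(ligand_xyz)

    # Pairwise protein-ligand distances, shape (n_p, n_l).
    diff = protein_xyz[:, None, :] - ligand_xyz[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)

    # Generalized exponential kernel; pairs beyond the cutoff get no edge.
    weights = np.exp(-((dist / eta) ** kappa))
    weights[dist > cutoff] = 0.0

    # Bipartite adjacency: no protein-protein or ligand-ligand edges.
    n = n_p + n_l
    adjacency = np.zeros((n, n))
    adjacency[:n_p, n_p:] = weights
    adjacency[n_p:, :n_p] = weights.T

    # Weighted graph Laplacian L = D - A.
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    return laplacian, adjacency


def spectral_features(matrix, tol=1e-10):
    """Summarize the positive eigenvalues of a symmetric matrix."""
    eigvals = np.linalg.eigvalsh(matrix)
    positive = eigvals[eigvals > tol]
    if positive.size == 0:
        return np.zeros(5)
    return np.array([
        positive.sum(),
        positive.min(),
        positive.max(),
        positive.mean(),
        positive.std(),
    ])


# Toy coordinates (in angstroms) standing in for one atom-type pair.
protein_atoms = [[0.0, 0.0, 0.0], [4.0, 0.0, 0.0]]
ligand_atoms = [[2.0, 3.0, 0.0]]
laplacian, adjacency = subgraph_matrices(protein_atoms, ligand_atoms)
features = np.concatenate([spectral_features(laplacian),
                           spectral_features(adjacency)])
```

In a full pipeline of this kind, such feature vectors would presumably be computed for every atom-type pair and for several kernel scales, concatenated, and passed to a regression model. Those steps are outside this sketch.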
Original language | English |
---|---|
Article number | 10 |
Journal | Journal of Cheminformatics |
Volume | 17 |
Issue number | 1 |
State | Published - Dec 2025 |
Bibliographical note
Publisher Copyright: © The Author(s) 2025.
Funding
The authors thank the anonymous reviewers for their valuable suggestions. This work is supported in part by funds from the National Science Foundation (NSF grants #2053284, #2151802, and #2245903), the University of Kentucky Startup Fund, and the Markey Cancer Research Informatics Shared Resource Facility (P30 CA177558). We would also like to thank OpenEye Scientific Software for providing the ROCS software.
Funders | Funder number |
---|---|
University of Kentucky Startup Fund | |
U.S. Department of Energy; Chinese Academy of Sciences; Guangzhou Municipal Science and Technology Project; Oak Ridge National Laboratory; Extreme Science and Engineering Discovery Environment; National Science Foundation; National Energy Research Scientific Computing Center; National Natural Science Foundation of China | 2053284, 2245903, 2151802 |
Markey Cancer Research Informatics Shared Resource Facility; National Cancer Institute | P30 CA177558 |
Keywords
- Algebraic graph learning
- Binding affinity predictions
- Extended atom type
- Non-redundant training sets
- Protein-ligand interactions
- Similarity computation
ASJC Scopus subject areas
- Computer Science Applications
- Physical and Theoretical Chemistry
- Computer Graphics and Computer-Aided Design
- Library and Information Sciences