Geometric graph learning with extended atom-types features for protein-ligand binding affinity prediction

Md Masud Rana, Duc Duy Nguyen

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

Understanding and accurately predicting protein-ligand binding affinity are essential in the drug design and discovery process. At present, machine learning-based methodologies are gaining popularity as a means of predicting binding affinity due to their efficiency and accuracy, as well as the increasing availability of structural and binding affinity data for protein-ligand complexes. In biomolecular studies, graph theory has been widely applied since graphs can be used to model molecules or molecular complexes in a natural manner. In the present work, we upgrade the graph-based learners for the study of protein-ligand interactions by integrating extensive atom types such as SYBYL and extended connectivity interactive features (ECIF) into multiscale weighted colored graphs (MWCG). By pairing with the gradient boosting decision tree (GBDT) machine learning algorithm, our approach results in two different methods, namely sybylGGL-Score and ecifGGL-Score. Both of our models are extensively validated in their scoring power using three commonly used benchmark datasets in the drug design area, namely CASF-2007, CASF-2013, and CASF-2016. The performance of our best model sybylGGL-Score is compared with other state-of-the-art models in the binding affinity prediction for each benchmark. While both of our models achieve state-of-the-art results, the SYBYL atom-type model sybylGGL-Score outperforms other methods by a wide margin in all benchmarks. Finally, the best-performing SYBYL atom-type model is evaluated on two test sets that are independent of CASF benchmarks.
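
As a rough illustration of the atom-type colored subgraph idea described above, the following minimal Python sketch sums kernel-weighted protein-ligand edges for each pair of atom types at several length scales. It is not the authors' implementation. The kernel form, the exponent `kappa`, the `scales` values, the `cutoff`, the function names, and the toy coordinates are all assumptions made for illustration; none of them are given in the abstract. In a full pipeline, feature vectors of this kind would be passed to a gradient boosting decision tree regressor.

```python
import math
from collections import defaultdict


def exp_kernel(r, eta, kappa=2.0):
    """Generalized exponential kernel (assumed form; kappa is illustrative)."""
    return math.exp(-((r / eta) ** kappa))


def colored_subgraph_features(protein_atoms, ligand_atoms,
                              scales=(2.0, 5.0), cutoff=12.0):
    """Sum kernel-weighted edges per (protein type, ligand type) pair and scale.

    Atoms are (atom_type, (x, y, z)) tuples. Only protein-ligand edges are
    considered, and pairs farther apart than `cutoff` are skipped.
    Returns a dict keyed by (protein_type, ligand_type, eta).
    """
    features = defaultdict(float)
    for p_type, p_xyz in protein_atoms:
        for l_type, l_xyz in ligand_atoms:
            r = math.dist(p_xyz, l_xyz)
            if r > cutoff:
                continue
            for eta in scales:
                features[(p_type, l_type, eta)] += exp_kernel(r, eta)
    return dict(features)


# Toy coordinates with SYBYL-style type labels, invented for illustration.
protein = [("C.3", (0.0, 0.0, 0.0)), ("N.am", (3.0, 0.0, 0.0))]
ligand = [("O.2", (0.0, 4.0, 0.0)), ("C.ar", (20.0, 0.0, 0.0))]
feats = colored_subgraph_features(protein, ligand)
```

Each key in `feats` corresponds to one colored subgraph at one scale, so using several scales yields a multiscale descriptor. The distant `C.ar` atom contributes nothing because it lies beyond the assumed cutoff.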

Original language: English
Article number: 107250
Journal: Computers in Biology and Medicine
Volume: 164
State: Published - Sep 2023

Bibliographical note

Publisher Copyright:
© 2023 Elsevier Ltd

Funding

The authors thank the anonymous reviewers for their valuable suggestions. This work is supported in part by funds from the National Science Foundation, United States (NSF: #2053284, #2151802, and #2245903), and the University of Kentucky Startup Fund.

Funders                                 Funder number
University of Kentucky Startup Fund
National Science Foundation (NSF)       2053284, 2245903, 2151802

    Keywords

    • Atom-type interaction
    • Geometric graph learning
    • Machine learning
    • Protein-ligand binding affinity
    • Weighted colored subgraph

    ASJC Scopus subject areas

    • Health Informatics
    • Computer Science Applications
