DG-GL: Differential geometry-based geometric learning of molecular datasets

Duc Duy Nguyen, Guo Wei Wei

Research output: Contribution to journal › Article › peer-review

59 Scopus citations

Abstract

Motivation: Despite its great success in various physical modeling, differential geometry (DG) has rarely been devised as a versatile tool for analyzing large, diverse, and complex molecular and biomolecular datasets because of the limited understanding of its potential power in dimensionality reduction and its ability to encode essential chemical and biological information in differentiable manifolds.

Results: We put forward a differential geometry-based geometric learning (DG-GL) hypothesis that the intrinsic physics of three-dimensional (3D) molecular structures lies on a family of low-dimensional manifolds embedded in a high-dimensional data space. We encode crucial chemical, physical, and biological information into 2D element interactive manifolds, extracted from a high-dimensional structural data space via a multiscale discrete-to-continuum mapping using differentiable density estimators. Differential geometry apparatuses are utilized to construct element interactive curvatures in analytical forms for certain analytically differentiable density estimators. These low-dimensional differential geometry representations are paired with a robust machine learning algorithm to showcase their descriptive and predictive powers for large, diverse, and complex molecular and biomolecular datasets. Extensive numerical experiments are carried out to demonstrate that the proposed DG-GL strategy outperforms other advanced methods in the predictions of drug discovery-related protein-ligand binding affinity, drug toxicity, and molecular solvation free energy.

Availability and implementation: http://weilab.math.msu.edu/DG-GL/.

Contact: [email protected].
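
The discrete-to-continuum mapping and the analytical curvature mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain Gaussian kernel as the differentiable density estimator, a single scale parameter `eta`, and the sign convention H = div(∇ρ/|∇ρ|)/2 for the mean curvature of a level set of ρ. The function names `density_derivatives` and `mean_curvature` are hypothetical. In an element-interactive setting, `atoms` would presumably be restricted to atoms of one element type.

```python
import numpy as np

def density_derivatives(r, atoms, eta):
    """Gradient and Hessian at point r of the assumed density
    rho(r) = sum_j exp(-|r - r_j|^2 / eta^2)."""
    d = r - atoms                                   # displacements, shape (n_atoms, 3)
    w = np.exp(-np.sum(d * d, axis=1) / eta**2)     # Gaussian weights, shape (n_atoms,)
    grad = (-2.0 / eta**2) * (w[:, None] * d).sum(axis=0)
    outer = np.einsum("j,ja,jb->ab", w, d, d)       # sum_j w_j * d_j d_j^T
    hess = (4.0 / eta**4) * outer - (2.0 / eta**2) * w.sum() * np.eye(3)
    return grad, hess

def mean_curvature(r, atoms, eta):
    """Mean curvature of the level set of rho passing through r,
    using H = (|g|^2 tr(Hess) - g^T Hess g) / (2 |g|^3) (assumed sign convention)."""
    grad, hess = density_derivatives(r, atoms, eta)
    g2 = grad @ grad
    return (g2 * np.trace(hess) - grad @ hess @ grad) / (2.0 * g2**1.5)
```

As a sanity check, for a single atom the level sets of this density are spheres centred on the atom, so the magnitude of the mean curvature at distance d from the atom should equal 1/d.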

Original language: English
Article number: e3179
Journal: International Journal for Numerical Methods in Biomedical Engineering
Volume: 35
Issue number: 3
DOIs
State: Published - Mar 2019

Bibliographical note

Publisher Copyright:
© 2019 John Wiley & Sons, Ltd.

Keywords

  • biomolecular data
  • drug discovery
  • geometric data analysis
  • machine learning

ASJC Scopus subject areas

  • Software
  • Biomedical Engineering
  • Modeling and Simulation
  • Molecular Biology
  • Computational Theory and Mathematics
  • Applied Mathematics
