Abstract
While automated feature extraction has had tremendous success in many deep learning algorithms for image analysis and natural language processing, it does not work well for data involving complex internal structures, such as molecules. Data representations via advanced mathematics, including algebraic topology, differential geometry, and graph theory, have demonstrated superiority in a variety of biomolecular applications; however, their performance often depends on manual parametrization. This work introduces the auto-parametrized weighted element-specific graph neural network, dubbed AweGNN, which overcomes this tedious parametrization process and also serves as a suitable technique for automated feature extraction on these internally complex biomolecular data sets. The AweGNN is a neural network model based on geometric-graph features of element-pair interactions, with its graph parameters updated throughout training, which results in what we call a network-enabled automatic representation (NEAR). To enhance predictions on small data sets, we construct multi-task (MT) AweGNN models in addition to single-task (ST) AweGNN models. The proposed methods are applied to various benchmark data sets, including four data sets for quantitative toxicity analysis and another data set for solvation prediction. Extensive numerical tests show that AweGNN models can achieve state-of-the-art performance in molecular property predictions.
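
The idea of graph parameters that are updated during training can be illustrated with a small sketch. The abstract does not state the kernel, the feature definition, or the update rule, so everything below is an assumption made for illustration: a generalized exponential kernel `exp(-(r/eta)**kappa)`, a feature defined as the sum of kernel values over all atom pairs of two given element types, and a single plain gradient-descent step on the scale parameter `eta` against a toy squared-error target. The function names (`pair_feature`, `pair_feature_grad_eta`, `gradient_step`) and the three-atom toy molecule are hypothetical and are not taken from the AweGNN implementation.

```python
import math

def exp_kernel(r, eta, kappa):
    """Generalized exponential kernel exp(-(r/eta)**kappa) (assumed kernel choice)."""
    return math.exp(-((r / eta) ** kappa))

def exp_kernel_grad_eta(r, eta, kappa):
    """Analytic derivative of exp_kernel with respect to eta."""
    return exp_kernel(r, eta, kappa) * kappa * (r ** kappa) / (eta ** (kappa + 1))

def _sum_over_pairs(atoms, elem_a, elem_b, fn):
    """Sum fn(distance) over ordered atom pairs (i, j), i != j, of types (elem_a, elem_b)."""
    total = 0.0
    for i, (ea, pa) in enumerate(atoms):
        if ea != elem_a:
            continue
        for j, (eb, pb) in enumerate(atoms):
            if i == j or eb != elem_b:
                continue
            total += fn(math.dist(pa, pb))
    return total

def pair_feature(atoms, elem_a, elem_b, eta, kappa):
    """Element-pair feature: summed kernel values for one (elem_a, elem_b) pair type."""
    return _sum_over_pairs(atoms, elem_a, elem_b,
                           lambda r: exp_kernel(r, eta, kappa))

def pair_feature_grad_eta(atoms, elem_a, elem_b, eta, kappa):
    """Derivative of pair_feature with respect to the kernel scale eta."""
    return _sum_over_pairs(atoms, elem_a, elem_b,
                           lambda r: exp_kernel_grad_eta(r, eta, kappa))

def gradient_step(atoms, elem_a, elem_b, eta, kappa, target, lr):
    """One gradient-descent step on eta for the toy loss (feature - target)**2.

    Returns (new_eta, loss_before, loss_after).
    """
    feature = pair_feature(atoms, elem_a, elem_b, eta, kappa)
    loss_before = (feature - target) ** 2
    grad = 2.0 * (feature - target) * pair_feature_grad_eta(
        atoms, elem_a, elem_b, eta, kappa)
    new_eta = eta - lr * grad
    loss_after = (pair_feature(atoms, elem_a, elem_b, new_eta, kappa) - target) ** 2
    return new_eta, loss_before, loss_after

# Hypothetical toy molecule: (element symbol, 3D coordinates in angstroms).
toy_atoms = [
    ("C", (0.0, 0.0, 0.0)),
    ("O", (1.2, 0.0, 0.0)),
    ("H", (0.0, 1.0, 0.0)),
]

new_eta, before, after = gradient_step(
    toy_atoms, "C", "O", eta=1.0, kappa=2.0, target=0.5, lr=0.1)
print(new_eta, before, after)
```

In a full model the kernel parameters would presumably be trained jointly with the network weights by backpropagation rather than against a hand-picked target; the sketch only shows that such features are differentiable in their parameters, which is what makes automatic parametrization possible.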
Original language | English |
---|---|
Article number | 104460 |
Journal | Computers in Biology and Medicine |
Volume | 134 |
DOIs | |
State | Published - Jul 2021 |
Bibliographical note
Funding Information: This work was supported in part by NIH grant GM126189, NSF Grants DMS-1721024, DMS-1761320, DMS-2052983, DMS-2053284, and IIS1900473, NASA 80NSSC21M0023, Michigan Economic Development Corporation, George Mason University award PD45722, Bristol-Myers Squibb BMS-65109, Pfizer, and University of Kentucky Start-up fund. The authors thank The IBM TJ Watson Research Center, The COVID-19 High Performance Computing Consortium, and NVIDIA for computational assistance.
Publisher Copyright:
© 2021 Elsevier Ltd
Keywords
- Automated feature extraction
- Deep neural network
- Mathematical representation
- Solvation
- Toxicity
ASJC Scopus subject areas
- Computer Science Applications
- Health Informatics