Predicting the Association of Metabolites with Both Pathway Categories and Individual Pathways

Erik D. Huckvale, Hunter N.B. Moseley

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Metabolism is a network of chemical reactions that sustain cellular life. Parts of this metabolic network are defined as metabolic pathways containing specific biochemical reactions. Products and reactants of these reactions are called metabolites, which are associated with certain human-defined metabolic pathways. Metabolic knowledgebases, such as the Kyoto Encyclopedia of Genes and Genomes (KEGG), contain metabolites, reactions, and pathway annotations; however, such resources are incomplete due to current limits of metabolic knowledge. To fill in missing metabolite pathway annotations, past machine learning models showed some success at predicting the KEGG Level 2 pathway category involvement of metabolites based on their chemical structure. Here, we present the first machine learning model to predict metabolite association with the more granular KEGG Level 3 metabolic pathways. We used a feature and dataset engineering approach to generate over one million metabolite-pathway entries in the dataset used to train a single binary classifier. This approach produced a mean Matthews correlation coefficient (MCC) of 0.806 ± 0.017 SD across 100 cross-validation iterations. The 172 Level 3 pathways were predicted with an overall MCC of 0.726. Moreover, metabolite association with the 12 Level 2 pathway categories was predicted with an overall MCC of 0.891, representing significant transfer learning from the Level 3 pathway entries. These are the best metabolite pathway prediction results published so far in the field.
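As a rough illustration of the two ideas named in the abstract (one dataset entry per metabolite-pathway pair, scored with the MCC), the following is a minimal sketch and not the authors' implementation. It assumes each metabolite and each pathway already has a numeric feature vector and that a pair entry is simply the two vectors concatenated; the abstract does not describe the actual features or pairing scheme. The names `build_pair_entries`, `mcc`, and the toy identifiers are hypothetical.

```python
import math
from itertools import product


def build_pair_entries(metabolite_features, pathway_features, known_pairs):
    """Cross every metabolite with every pathway (assumed pairing scheme).

    Each entry is (metabolite_id, pathway_id, features, label), where the
    features are the two vectors concatenated and the label is 1 only if the
    pair appears in the known annotations.
    """
    entries = []
    for (m_id, m_vec), (p_id, p_vec) in product(
        metabolite_features.items(), pathway_features.items()
    ):
        label = 1 if (m_id, p_id) in known_pairs else 0
        entries.append((m_id, p_id, list(m_vec) + list(p_vec), label))
    return entries


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Toy, made-up data: 2 metabolites x 3 pathways gives 6 pair entries.
metabolites = {"M1": [1.0, 0.0], "M2": [0.0, 1.0]}
pathways = {"P1": [1.0], "P2": [0.0], "P3": [0.5]}
known = {("M1", "P1"), ("M2", "P2")}

entries = build_pair_entries(metabolites, pathways, known)
labels = [entry[3] for entry in entries]
```

Because every metabolite is paired with every pathway, the dataset grows multiplicatively and most entries are negatives. The MCC uses all four confusion-matrix counts, which makes it a reasonable score for such imbalanced binary problems.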

Original language: English
Article number: 510
Journal: Metabolites
Volume: 14
Issue number: 9
State: Published - Sep 2024

Bibliographical note

Publisher Copyright:
© 2024 by the authors.

Keywords

  • binary classification
  • biochemistry
  • machine learning
  • metabolism
  • metabolites
  • multi-layer perceptron
  • pathways
  • supervised learning
  • transfer learning

ASJC Scopus subject areas

  • Endocrinology, Diabetes and Metabolism
  • Biochemistry
  • Molecular Biology
