Non-Negative Matrix Factorization for Drug Repositioning: Experiments with the repoDB Dataset

Gokhan Bakal, Halil Kilicoglu, Ramakanth Kavuluru

Research output: Contribution to journal › Article › peer-review

6 Scopus citations


Computational methods for drug repositioning are gaining mainstream attention with the availability of experimental gene expression datasets and manually curated relational information in knowledge bases. When building repurposing tools, a fundamental limitation is the lack of gold standard datasets that contain realistic true negative examples of drug-disease pairs that were shown to be non-indications. To address this gap, the repoDB dataset was created in 2017 as a first-of-its-kind realistic resource to benchmark drug repositioning methods: its positive examples are drawn from FDA-approved indications and its negative examples are derived from failed clinical trials. In this paper, we present the first effort for repositioning that directly tests against repoDB instances. By using hand-curated drug-disease indications from the UMLS Metathesaurus and automatically extracted relations from the SemMedDB database, we employ non-negative matrix factorization (NMF) methods to recover repoDB positive indications. Among recoverable approved indications, our NMF methods achieve 96% recall with 80% precision, providing further evidence that hand-curated knowledge and matrix completion methods can be exploited for hypothesis generation.

Original language: English
Pages (from-to): 238-247
Number of pages: 10
Journal: AMIA ... Annual Symposium proceedings. AMIA Symposium
State: Published - 2019

Bibliographical note

Publisher Copyright:
©2019 AMIA - All rights reserved.

ASJC Scopus subject areas

  • General Medicine

