Convolutional neural networks for biomedical text classification: Application in indexing biomedical articles

Anthony Rios, Ramakanth Kavuluru

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

100 Scopus citations

Abstract

Building high-accuracy text classifiers is an important task in biomedicine, given the wealth of information hidden in unstructured narratives such as research articles and clinical documents. Because feature spaces are large, text classification has traditionally relied on discriminative approaches such as logistic regression and support vector machines with n-gram and semantic features (e.g., named entities), with additional performance gains typically coming from feature selection and ensemble approaches. In this paper, we demonstrate that a more direct approach using convolutional neural networks (CNNs) outperforms several traditional approaches in biomedical text classification, with the specific use case of assigning medical subject headings (MeSH terms) to biomedical articles. Trained annotators at the National Library of Medicine (NLM) assign on average 13 codes to each biomedical article, thus semantically indexing scientific literature to support NLM's PubMed search system. Recent evidence suggests that effective automated efforts for MeSH term assignment start with binary classifiers for each term. We use CNNs to build binary text classifiers and achieve an absolute improvement of over 3% in macro F-score over a set of selected hard-to-classify MeSH terms when compared with the best prior results on a public dataset. Additional experiments on 50 high-frequency terms in the dataset also show improvements with CNNs. Our results indicate the strong potential of CNNs in biomedical text classification tasks.
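
The evaluation setup described in the abstract (one binary classifier per MeSH term, scored by macro F-score) can be illustrated with a minimal sketch. It assumes that macro F-score means the unweighted mean of per-term F1 values; the abstract does not say which variant the paper uses. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Set


def f1_for_term(gold: Sequence[bool], pred: Sequence[bool]) -> float:
    """F1 of a single binary classifier (one term) over a list of articles."""
    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined),
        # so the F1 is reported as 0 here by convention.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold_sets: List[Set[str]],
             pred_sets: List[Set[str]],
             terms: List[str]) -> float:
    """Unweighted mean of per-term F1 scores (assumed macro F-score variant)."""
    scores = []
    for term in terms:
        # Reduce the multi-label problem to one yes/no decision per term.
        gold = [term in labels for labels in gold_sets]
        pred = [term in labels for labels in pred_sets]
        scores.append(f1_for_term(gold, pred))
    return sum(scores) / len(scores)


# Toy data (hypothetical): gold and predicted term sets for four articles.
gold_sets = [{"Humans", "Mice"}, {"Humans"}, {"Mice"}, set()]
pred_sets = [{"Humans"}, {"Humans", "Mice"}, {"Mice"}, {"Humans"}]
terms = ["Humans", "Mice"]

score = macro_f1(gold_sets, pred_sets, terms)
print(round(score, 2))
```

Because every term contributes equally to the mean, rare or hard-to-classify terms weigh as much as frequent ones, which is one reason a macro average suits an evaluation focused on difficult terms.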

Original language: English
Title of host publication: BCB 2015 - 6th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics
Pages: 258-267
Number of pages: 10
ISBN (Electronic): 9781450338530
State: Published - Sep 9 2015
Event: 6th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, BCB 2015 - Atlanta, United States
Duration: Sep 9 2015 to Sep 12 2015

Publication series

Name: BCB 2015 - 6th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics

Conference

Conference: 6th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, BCB 2015
Country/Territory: United States
City: Atlanta
Period: 9/9/15 to 9/12/15

Bibliographical note

Publisher Copyright:
Copyright 2015 ACM.

Keywords

  • Convolutional neural networks
  • Medical subject headings
  • Text classification

ASJC Scopus subject areas

  • Software
  • Health Informatics
  • Computer Science Applications
  • Biomedical Engineering
