Knowledge-based biomedical word sense disambiguation with neural concept embeddings

Akm Sabbir, Antonio Jimeno-Yepes, Ramakanth Kavuluru

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

29 Scopus citations

Abstract

Biomedical word sense disambiguation (WSD) is an important intermediate task in many natural language processing applications such as named entity recognition, syntactic parsing, and relation extraction. In this paper, we employ knowledge-based approaches that also exploit recent advances in neural word/concept embeddings to improve over the state of the art in biomedical WSD, using the MSH WSD dataset as the test set. Our methods involve weak supervision: we do not use any hand-labeled examples for WSD to build our prediction models; however, we employ an existing, well-known named entity recognition and concept mapping program, MetaMap, to obtain our concept vectors. On the MSH WSD dataset, our linear-time method (linear in the numbers of senses and words in the test instance) achieves an accuracy of 92.24%, which is an absolute 3% improvement over the best known results obtained via unsupervised or knowledge-based means. A more expensive approach that we developed relies on a nearest neighbor framework and achieves an accuracy of 94.34%, essentially cutting the error rate in half. Employing dense vector representations learned from unlabeled free text has recently been shown to benefit many language processing tasks, and our efforts show that biomedical WSD is no exception to this trend. For a complex and rapidly evolving domain such as biomedicine, building labeled datasets for larger sets of ambiguous terms may be impractical. Here, we show that weak supervision that leverages recent advances in representation learning can rival supervised approaches in biomedical WSD. However, external knowledge bases (here, sense inventories) play a key role in the improvements achieved.
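The linear-time idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring function, so the following assumes one plausible reading: average the context word vectors, then pick the candidate sense whose concept vector has the highest cosine similarity to that average. The function names, the sense labels, and the three-dimensional toy vectors are all hypothetical and are not taken from the paper or from MetaMap output; the sketch also assumes that word and concept vectors live in the same embedding space.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def average(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def disambiguate(context_words, candidate_senses, word_vecs, concept_vecs):
    """Return the candidate sense closest to the averaged context vector.

    One pass over the context words plus one pass over the candidate
    senses, so the cost is linear in both counts.
    Returns None when no context word has a known vector.
    """
    known = [word_vecs[w] for w in context_words if w in word_vecs]
    if not known:
        return None
    context_vec = average(known)
    return max(candidate_senses, key=lambda s: cosine(context_vec, concept_vecs[s]))


# Hypothetical toy embeddings, made up for illustration only.
WORD_VECS = {
    "fever": [0.9, 0.1, 0.0],
    "cough": [0.8, 0.2, 0.1],
    "winter": [0.1, 0.9, 0.0],
    "freezing": [0.0, 1.0, 0.1],
}
CONCEPT_VECS = {
    "SENSE_COMMON_COLD": [1.0, 0.1, 0.0],
    "SENSE_COLD_TEMPERATURE": [0.0, 1.0, 0.0],
}

prediction = disambiguate(
    ["patient", "fever", "cough"], list(CONCEPT_VECS), WORD_VECS, CONCEPT_VECS
)
print(prediction)
```

No hand-labeled WSD examples appear anywhere in this sketch; the only sense-specific input is the set of concept vectors, which is where a sense inventory would enter.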

Original language: English
Title of host publication: Proceedings - 2017 IEEE 17th International Conference on Bioinformatics and Bioengineering, BIBE 2017
Pages: 163-170
Number of pages: 8
ISBN (Electronic): 9781538613245
State: Published - Jul 1 2017
Event: 17th IEEE International Conference on Bioinformatics and Bioengineering, BIBE 2017 - Herndon, United States
Duration: Oct 23 2017 to Oct 25 2017

Publication series

Name: Proceedings - 2017 IEEE 17th International Conference on Bioinformatics and Bioengineering, BIBE 2017
Volume: 2018-January

Conference

Conference: 17th IEEE International Conference on Bioinformatics and Bioengineering, BIBE 2017
Country/Territory: United States
City: Herndon
Period: 10/23/17 to 10/25/17

Bibliographical note

Publisher Copyright:
© 2017 IEEE.

Keywords

  • Knowledge based systems
  • Neural embeddings
  • Word sense disambiguation

ASJC Scopus subject areas

  • Signal Processing
  • Information Systems
  • Biomedical Engineering
  • Modeling and Simulation
  • Health Informatics
