On interestingness measures for mining statistically significant and novel clinical associations from EMRs

Orhan Abar, Richard J. Charnigo, Abner Rayapati, Ramakanth Kavuluru

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Scopus citations

Abstract

Association rule mining has received significant attention from both the data mining and machine learning communities. While data mining researchers focus more on designing efficient algorithms to mine rules from large datasets, the learning community has explored applications of rule mining to classification. A major problem with rule mining algorithms is the explosion of rules even for moderately sized datasets, which makes it very difficult for end users to identify both statistically significant and potentially novel rules that could lead to interesting new insights and hypotheses. Researchers have proposed many domain-independent interestingness measures with which one can rank the rules and potentially glean useful rules from the top-ranked ones. However, these measures have not been fully explored for rule mining in clinical datasets, owing to the relatively large sizes of the datasets often encountered in healthcare and to limited access to domain experts for review and analysis. In this paper, using an electronic medical record (EMR) dataset of diagnoses and medications from over three million patient visits to the University of Kentucky medical center and affiliated clinics, we conduct a thorough evaluation of dozens of interestingness measures proposed in the data mining literature, including some new composite measures. Using cumulative relevance metrics from information retrieval, we compare these interestingness measures against human judgments obtained from a practicing psychiatrist for association rules involving the depressive disorders class as the consequent. Our results not only surface new interesting associations for depressive disorders but also indicate classes of interestingness measures that weight rule novelty and statistical strength in contrasting ways, offering new insights for end users in identifying interesting rules.
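As a rough illustration of the two ingredients the abstract mentions, the sketch below computes three textbook interestingness measures (support, confidence, lift) for a single rule and then scores a ranked list of rules against graded relevance judgments with normalized discounted cumulative gain (NDCG). This is an assumption-laden sketch, not the authors' method: the abstract does not say which measures or which cumulative relevance metric were used, and the function names and all counts below are hypothetical.

```python
import math
from typing import Dict, Sequence


def rule_measures(n_total: int, n_antecedent: int,
                  n_consequent: int, n_both: int) -> Dict[str, float]:
    """Support, confidence, and lift for a rule antecedent -> consequent.

    All arguments are visit counts (hypothetical here): total visits,
    visits containing the antecedent, visits containing the consequent,
    and visits containing both.
    """
    support = n_both / n_total                    # P(A and C)
    confidence = n_both / n_antecedent            # P(C | A)
    lift = confidence / (n_consequent / n_total)  # P(C | A) / P(C)
    return {"support": support, "confidence": confidence, "lift": lift}


def dcg(relevances: Sequence[float]) -> float:
    """Discounted cumulative gain of relevance grades in ranked order."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances: Sequence[float]) -> float:
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Toy counts, invented for illustration only.
measures = rule_measures(n_total=1000, n_antecedent=100,
                         n_consequent=200, n_both=50)

# Expert grades (higher = more interesting) listed in the order that a
# given interestingness measure ranks the rules.
good_ranking = ndcg([3, 2, 1])
poor_ranking = ndcg([1, 2, 3])
```

Under this setup, a measure whose ranking places the expert's highest-graded rules first receives an NDCG closer to 1, which gives one way to compare measures against human judgments.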

Original language: English
Title of host publication: ACM-BCB 2016 - 7th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics
Pages: 587-594
Number of pages: 8
ISBN (Electronic): 9781450342254
State: Published - Oct 2 2016
Event: 7th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, ACM-BCB 2016 - Seattle, United States
Duration: Oct 2 2016 to Oct 5 2016

Publication series

Name: ACM-BCB 2016 - 7th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics

Conference

Conference: 7th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, ACM-BCB 2016
Country/Territory: United States
City: Seattle
Period: 10/2/16 to 10/5/16

Bibliographical note

Funding Information:
We are grateful to anonymous reviewers for their helpful comments that improved the presentation of this paper and for interesting suggestions to extend our work using an event sequence framework. This work is supported by the National Center for Advancing Translational Sciences through Grant UL1TR000117 and the Kentucky Lung Cancer Research Program through Grant PO2-415-1400004000-1. The content of this paper is the responsibility of the authors and does not necessarily represent the official views of the NIH.

Publisher Copyright:
Copyright 2016 ACM.

Keywords

  • Association rule mining
  • Electronic medical records
  • Rule interestingness measures

ASJC Scopus subject areas

  • Software
  • Health Informatics
  • Biomedical Engineering
  • Computer Science Applications
