Abstract
We present a custom Boolean query generator that uses common table expressions (CTEs) and scales to large datasets. The generator maps user-defined Boolean queries, such as those created interactively in clinical-research and general-purpose healthcare tools, into SQL. We demonstrate its effectiveness by integrating it into the Informatics for Integrating Biology and the Bedside (i2b2) query tool and showing that it scales. Our custom generator replaces and outperforms the default query generator in the Clinical Research Chart cell of i2b2. In our experiments, we identified 16 types of i2b2 queries by varying four constraints: date, frequency, exclusion criteria, and whether selected concepts occurred in the same encounter. We generated nontrivial, random Boolean queries based on these 16 types and compared the execution times of the corresponding SQL queries produced by both generators. The CTE-based solution significantly outperformed the default query generator and provided a much more consistent response time across all query types (M = 2.03 s, SD = 6.64 s versus M = 75.82 s, SD = 238.88 s). We thus provide a scalable CTE-based solution that requires no costly hardware upgrades and shows very promising empirical performance gains. The evaluation methodology also provides a means of profiling clinical data warehouse performance.
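To make the idea of mapping a Boolean query to CTE-based SQL concrete, the following is a minimal sketch and not the generator described in the article. It assumes a heavily simplified `observation_fact` table with only `patient_num` and `concept_cd` columns, treats a query as a list of panels that are ANDed together (concepts within a panel are ORed, and a panel may be marked as an exclusion), and runs against an in-memory SQLite database. The function names, panel format, and concept codes are hypothetical, and the date, frequency, and same-encounter constraints are omitted.

```python
import sqlite3


def build_cte_query(panels):
    """Build a CTE-based SQL query from a list of panels (hypothetical format).

    Each panel is a dict with "concepts" (codes that are ORed together) and an
    optional "exclude" flag. Panels are ANDed; excluded panels are subtracted.
    Returns the SQL text and the list of bound parameters.
    """
    ctes, params, include, exclude = [], [], [], []
    for i, panel in enumerate(panels):
        name = f"panel_{i}"
        placeholders = ", ".join("?" for _ in panel["concepts"])
        # One CTE per panel: patients with at least one of the panel's concepts.
        ctes.append(
            f"{name} AS (SELECT DISTINCT patient_num FROM observation_fact "
            f"WHERE concept_cd IN ({placeholders}))"
        )
        params.extend(panel["concepts"])
        (exclude if panel.get("exclude") else include).append(name)
    if not include:
        raise ValueError("at least one inclusion panel is required")
    # AND between inclusion panels becomes INTERSECT; exclusions become EXCEPT.
    body = " INTERSECT ".join(f"SELECT patient_num FROM {n}" for n in include)
    for n in exclude:
        body += f" EXCEPT SELECT patient_num FROM {n}"
    return "WITH " + ",\n     ".join(ctes) + "\n" + body, params


def run_demo():
    """Run one example query on toy data and return the matching patients."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE observation_fact (patient_num INTEGER, concept_cd TEXT)")
    rows = [
        (1, "DX:DIABETES"), (1, "RX:METFORMIN"),
        (2, "DX:DIABETES"), (2, "RX:INSULIN"), (2, "DX:CKD"),
        (3, "DX:HYPERTENSION"), (3, "RX:METFORMIN"),
    ]
    conn.executemany("INSERT INTO observation_fact VALUES (?, ?)", rows)
    # diabetes AND (metformin OR insulin) AND NOT chronic kidney disease
    panels = [
        {"concepts": ["DX:DIABETES"]},
        {"concepts": ["RX:METFORMIN", "RX:INSULIN"]},
        {"concepts": ["DX:CKD"], "exclude": True},
    ]
    sql, params = build_cte_query(panels)
    result = sorted(r[0] for r in conn.execute(sql, params))
    conn.close()
    return result


if __name__ == "__main__":
    print(run_demo())
```

Giving each panel its own named CTE keeps the generated SQL flat and readable, and the set operators express the Boolean structure directly. Whether this shape performs well on a real warehouse depends on the database engine and its optimizer.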
Original language | English |
---|---|
Article number | 6674997 |
Pages (from-to) | 1607-1613 |
Number of pages | 7 |
Journal | IEEE Journal of Biomedical and Health Informatics |
Volume | 18 |
Issue number | 5 |
State | Published - Sep 2014 |
Bibliographical note
Publisher Copyright: © 2013 IEEE.
Keywords
- Biomedical computing
- biomedical informatics
- data systems
- data warehouses
- health information management
- large-scale systems
ASJC Scopus subject areas
- Computer Science Applications
- Health Informatics
- Electrical and Electronic Engineering
- Health Information Management