Enhancing the quality of hierarchic relations in the national cancer institute thesaurus to enable faceted query of cancer registry data

Licong Cui, Rashmie Abeysinghe, Fengbo Zheng, Shiqiang Tao, Ningzhou Zeng, Isaac Hands, Eric B. Durbin, Lori Whiteman, Lyubov Remennik, Nicholas Sioutos, Guo Qiang Zhang

Research output: Article, peer-reviewed

5 Citations (Scopus)

Abstract

PURPOSE To audit and improve the completeness of the hierarchic (or is-a) relations of the National Cancer Institute (NCI) Thesaurus to support its role as a faceted system for querying cancer registry data.

METHODS We performed quality auditing of the 19.01d version of the NCI Thesaurus. Our hybrid auditing method consisted of three main steps: computing nonlattice subgraphs, constructing lexical features for concepts in each subgraph, and performing subsumption reasoning with each subgraph to automatically suggest potentially missing is-a relations.

RESULTS A total of 9,512 nonlattice subgraphs were obtained. Our method identified 925 potentially missing is-a relations in 441 nonlattice subgraphs; 72 of 176 reviewed samples were confirmed as valid missing is-a relations and have been incorporated in the newer versions of the NCI Thesaurus.

CONCLUSION Autosuggested changes resulting from our auditing method can improve the structural organization of the NCI Thesaurus in supporting its new role for faceted query.
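The three steps named in METHODS can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a nonlattice pair is a pair of concepts with more than one maximal common descendant, that a concept's lexical features are simply the words of its name, and that a missing is-a relation is suggested when one concept's word set strictly contains another's and no is-a path already connects them. The toy hierarchy `IS_A` and all function names are hypothetical.

```python
from itertools import combinations, permutations

# Hypothetical toy hierarchy: concept -> set of direct is-a parents.
IS_A = {
    "neoplasm": set(),
    "skin neoplasm": {"neoplasm"},
    "malignant neoplasm": {"neoplasm"},
    "malignant skin neoplasm": {"skin neoplasm", "malignant neoplasm"},
    # The parent "malignant skin neoplasm" is deliberately left out here.
    "metastatic malignant skin neoplasm": {"skin neoplasm", "malignant neoplasm"},
}


def ancestors(concept, is_a):
    """Return all proper ancestors of a concept (transitive is-a closure)."""
    seen, stack = set(), list(is_a[concept])
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a[parent])
    return seen


def nonlattice_subgraphs(is_a):
    """Step 1: yield (pair, concepts) for each pair of concepts that has
    more than one maximal common descendant."""
    anc = {c: ancestors(c, is_a) for c in is_a}
    # down[c] holds c itself plus every descendant of c.
    down = {c: {d for d in is_a if d == c or c in anc[d]} for c in is_a}
    for a, b in combinations(sorted(is_a), 2):
        common = down[a] & down[b]
        maximal = {c for c in common if not (anc[c] & common)}
        if len(maximal) > 1:
            # Keep concepts under the pair that lie at or above a maximal
            # common descendant; this includes the pair itself.
            nodes = {
                c for c in down[a] | down[b]
                if any(c == m or c in anc[m] for m in maximal)
            }
            yield (a, b), nodes


def suggest_missing_is_a(is_a):
    """Steps 2 and 3: build word-set features per concept, then suggest
    (child, parent) when the child's words strictly contain the parent's
    and the hierarchy has no is-a path from child to parent."""
    anc = {c: ancestors(c, is_a) for c in is_a}
    suggestions = set()
    for _, nodes in nonlattice_subgraphs(is_a):
        for child, parent in permutations(nodes, 2):
            child_words, parent_words = set(child.split()), set(parent.split())
            if parent_words < child_words and parent not in anc[child]:
                suggestions.add((child, parent))
    return suggestions


if __name__ == "__main__":
    for pair, nodes in nonlattice_subgraphs(IS_A):
        print("nonlattice pair:", pair, "subgraph size:", len(nodes))
    print("suggested:", suggest_missing_is_a(IS_A))
```

On this toy hierarchy the sketch finds one nonlattice pair ("malignant neoplasm" and "skin neoplasm") and suggests one relation: "metastatic malignant skin neoplasm" is-a "malignant skin neoplasm". As the RESULTS paragraph indicates, such suggestions are only candidates and still need expert review.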

Original language: English
Pages (from-to): 392-398
Number of pages: 7
Journal: JCO clinical cancer informatics
Volume: 4
DOI
State: Published - 2020

Bibliographical note

Publisher Copyright:
© 2020 by American Society of Clinical Oncology. Licensed under the Creative Commons Attribution 4.0 License.

Funding

Supported by the National Science Foundation through Grant No. IIS7-1931134 and the National Cancer Institute, National Institutes of Health, through Grant No. R21CA231904.

Funders and funder numbers
National Science Foundation Arctic Social Science Program: IIS7-1931134
National Institutes of Health (NIH)
National Childhood Cancer Registry – National Cancer Institute: R21CA231904

    ASJC Scopus subject areas

    • Oncology
    • Health Informatics
    • Cancer Research
