Contrastive Cross-Modal Pre-Training: A General Strategy for Small Sample Medical Imaging

Research output: Contribution to journal > Article > peer-review

12 Scopus citations

Abstract

A key challenge in training neural networks for a given medical imaging task is the difficulty of obtaining a sufficient number of manually labeled examples. In contrast, textual imaging reports are often readily available in medical records and contain rich but unstructured interpretations written by experts as part of standard clinical practice. We propose using these textual reports as a form of weak supervision to improve the image interpretation performance of a neural network without requiring additional manually labeled examples. We use an image-text matching task to train a feature extractor and then fine-tune it in a transfer learning setting for a supervised task using a small labeled dataset. The end result is a neural network that automatically interprets imagery without requiring textual reports during inference. We evaluate our method on three classification tasks and find consistent performance improvements, reducing the need for labeled data by 67%-98%.
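The abstract does not spell out the image-text matching objective, so the following is only a minimal sketch of one common formulation: a symmetric contrastive loss over a batch of paired image and report embeddings. The function names, the `temperature` value, and the random stand-in embeddings are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def log_softmax(logits):
    """Numerically stable row-wise log-softmax."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


def matching_loss(image_emb, text_emb, temperature=0.1):
    """Symmetric contrastive loss (assumed formulation, not taken from the paper).

    Row i of image_emb and row i of text_emb are treated as a matching pair;
    every other row in the batch serves as a negative.
    """
    img = l2_normalize(image_emb)
    txt = l2_normalize(text_emb)
    logits = img @ txt.T / temperature  # shape: (batch, batch)
    idx = np.arange(logits.shape[0])
    loss_image_to_text = -log_softmax(logits)[idx, idx].mean()
    loss_text_to_image = -log_softmax(logits.T)[idx, idx].mean()
    return 0.5 * (loss_image_to_text + loss_text_to_image)


# Hypothetical stand-ins for encoder outputs: 8 pairs of 16-dimensional embeddings.
rng = np.random.default_rng(0)
text = rng.normal(size=(8, 16))
aligned_images = text + 0.01 * rng.normal(size=(8, 16))  # images close to their reports
mismatched_images = aligned_images[::-1]                  # pairing deliberately broken

loss_aligned = matching_loss(aligned_images, text)
loss_mismatched = matching_loss(mismatched_images, text)
```

Under these assumptions, correctly paired embeddings give a lower loss than mismatched ones. In the workflow the abstract describes, the image encoder trained with such an objective would then be fine-tuned on the small labeled dataset, and the text branch would not be needed at inference time.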

Original language: English
Pages (from-to): 1640-1649
Number of pages: 10
Journal: IEEE Journal of Biomedical and Health Informatics
Volume: 26
Issue number: 4
State: Published - Apr 1 2022

Bibliographical note

Publisher Copyright:
© 2013 IEEE.

Funding

Manuscript received December 23, 2020; revised June 3, 2021 and August 24, 2021; accepted August 30, 2021. Date of publication September 8, 2021; date of current version April 13, 2022. This work was supported in part by the National Science Foundation under Grant IIS-1553116, in part by the American Cancer Society under Grant IRG-19-140-31, and in part by the National Cancer Institute under Grant P30CA177558. (Corresponding author: Gongbo Liang.) Gongbo Liang is with the Department of Computer Science, University of Kentucky, Lexington, KY 40506 USA, and also with the Department of Computer Science and Information Technology, Eastern Kentucky University, Richmond, KY 40475 USA (e-mail: [email protected]).

Funders                                                          Funder number
National Science Foundation Arctic Social Science Program        IIS-1553116
American Cancer Society-Michigan Cancer Research Fund            IRG-19-140-31
National Childhood Cancer Registry – National Cancer Institute   P30CA177558
National Center for Advancing Translational Sciences (NCATS)     UL1TR001998

    Keywords

    • Annotation-efficient modeling
    • convolutional neural network
    • pre-training
    • text-image matching

    ASJC Scopus subject areas

    • Computer Science Applications
    • Health Informatics
    • Electrical and Electronic Engineering
    • Health Information Management
