TY - JOUR
T1 - Development of message passing-based graph convolutional networks for classifying cancer pathology reports
AU - Yoon, Hong Jun
AU - Klasky, Hilda B.
AU - Blanchard, Andrew E.
AU - Christian, J. Blair
AU - Durbin, Eric B.
AU - Wu, Xiao Cheng
AU - Stroup, Antoinette
AU - Doherty, Jennifer
AU - Coyle, Linda
AU - Penberthy, Lynne
AU - Tourassi, Georgia D.
N1 - Publisher Copyright: © The Author(s) 2024.
PY - 2024/12
Y1 - 2024/12
AB - Background: Applying graph convolutional networks (GCN) to the classification of free-form natural language texts leveraged by graph-of-words features (TextGCN) was studied and confirmed to be an effective means of describing complex natural language texts. However, the text classification models based on the TextGCN possess weaknesses in terms of memory consumption and model dissemination and distribution. In this paper, we present a fast message passing network (FastMPN), implementing a GCN with message passing architecture that provides versatility and flexibility by allowing trainable node embedding and edge weights, helping the GCN model find the better solution. We applied the FastMPN model to the task of clinical information extraction from cancer pathology reports, extracting the following six properties: main site, subsite, laterality, histology, behavior, and grade. Results: We evaluated the clinical task performance of the FastMPN models in terms of micro- and macro-averaged F1 scores. A comparison was performed with the multi-task convolutional neural network (MT-CNN) model. Results show that the FastMPN model is equivalent to or better than the MT-CNN. Conclusions: Our implementation revealed that our FastMPN model, which is based on the PyTorch platform, can train a large corpus (667,290 training samples) with 202,373 unique words in less than 3 minutes per epoch using one NVIDIA V100 hardware accelerator. Our experiments demonstrated that using this implementation, the clinical task performance scores of information extraction related to tumors from cancer pathology reports were highly competitive.
KW - Cancer pathology reports
KW - Deep learning
KW - Graph
KW - Graph convolutional networks
KW - Graph of words
KW - Information extraction
KW - Message passing networks
KW - Natural language processing
UR - http://www.scopus.com/inward/record.url?scp=85204297294&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85204297294&partnerID=8YFLogxK
U2 - 10.1186/s12911-024-02662-5
DO - 10.1186/s12911-024-02662-5
M3 - Article
C2 - 39289714
AN - SCOPUS:85204297294
SN - 1472-6947
VL - 24
JO - BMC Medical Informatics and Decision Making
JF - BMC Medical Informatics and Decision Making
IS - Suppl 5
M1 - 262
ER -