TY - JOUR
T1 - The impact of high-order features on performance of radiomics studies in CT non-small cell lung cancer
AU - Ge, Gary
AU - Zhang, Jason Z.
AU - Zhang, Jie
N1 - Publisher Copyright: © 2024
PY - 2024/9
Y1 - 2024/9
N2 - High-order radiomic features have been shown to produce high-performance models in a variety of scenarios. However, models trained without high-order features have shown similar performance, raising the question of whether high-order features are worth including given their increased computational burden. This comparative study investigates the impact of high-order features on model performance in CT-based Non-Small Cell Lung Cancer (NSCLC) and the potential uncertainty regarding their application in machine learning. Three categories of features were retrospectively retrieved from CT images of 347 NSCLC patients: first- and second-order statistical features, morphological features, and transform (high-order) features. From these, three datasets were constructed: a “low-order” dataset (Lo) which included the first-order, second-order, and morphological features, a high-order dataset (Hi), and a combined dataset (Combo). A diverse selection of datasets, feature selection methods, and predictive models were included for the uncertainty analysis, with two-year survival as the study endpoint. AUC values were calculated for comparisons and Kruskal-Wallis testing was performed to determine significant differences. The Hi (AUC: 0.41–0.62) and Combo (AUC: 0.41–0.62) datasets generate significantly (P < 0.01) higher model performance than the Lo dataset (AUC: 0.42–0.58). High-order features are selected more often than low-order features for model training, comprising 87% of selected features in the Combo dataset. High-order features are a source of data that can improve machine learning model performance. However, their impact strongly depends on various factors that may lead to inconsistent results. A clear approach to incorporating high-order features in radiomic studies requires further investigation.
AB - High-order radiomic features have been shown to produce high-performance models in a variety of scenarios. However, models trained without high-order features have shown similar performance, raising the question of whether high-order features are worth including given their increased computational burden. This comparative study investigates the impact of high-order features on model performance in CT-based Non-Small Cell Lung Cancer (NSCLC) and the potential uncertainty regarding their application in machine learning. Three categories of features were retrospectively retrieved from CT images of 347 NSCLC patients: first- and second-order statistical features, morphological features, and transform (high-order) features. From these, three datasets were constructed: a “low-order” dataset (Lo) which included the first-order, second-order, and morphological features, a high-order dataset (Hi), and a combined dataset (Combo). A diverse selection of datasets, feature selection methods, and predictive models were included for the uncertainty analysis, with two-year survival as the study endpoint. AUC values were calculated for comparisons and Kruskal-Wallis testing was performed to determine significant differences. The Hi (AUC: 0.41–0.62) and Combo (AUC: 0.41–0.62) datasets generate significantly (P < 0.01) higher model performance than the Lo dataset (AUC: 0.42–0.58). High-order features are selected more often than low-order features for model training, comprising 87% of selected features in the Combo dataset. High-order features are a source of data that can improve machine learning model performance. However, their impact strongly depends on various factors that may lead to inconsistent results. A clear approach to incorporating high-order features in radiomic studies requires further investigation.
KW - CT
KW - High-order features
KW - Machine learning
KW - NSCLC
KW - Radiomics
UR - https://www.scopus.com/pages/publications/85200123214
UR - https://www.scopus.com/inward/citedby.url?scp=85200123214&partnerID=8YFLogxK
U2 - 10.1016/j.clinimag.2024.110244
DO - 10.1016/j.clinimag.2024.110244
M3 - Article
C2 - 39096890
AN - SCOPUS:85200123214
SN - 0899-7071
VL - 113
JO - Clinical Imaging
JF - Clinical Imaging
M1 - 110244
ER -