Purpose: Individualized therapy of lung adenocarcinoma depends on the accurate classification of patients into subgroups with poor and good prognosis, which reflect different probabilities of disease recurrence and survival following therapy. However, it is currently impossible to reliably identify specific high-risk patients. Here, we propose a computational model system that accurately predicts the clinical outcome of individual patients based on their gene expression profiles. Experimental Design: Gene signatures were selected using three feature selection algorithms: random forests, correlation-based feature selection, and gain ratio attribute selection. Prediction models were built using random committee and Bayesian belief networks. The prognostic power of the survival predictors was also evaluated using hierarchical cluster analysis and Kaplan-Meier analysis. Results: The predictive accuracy of an identified 37-gene survival signature is 0.96, as measured by the area under the time-dependent receiver operating characteristic curves. The cluster analysis, using the 37-gene signature, aggregates the patient samples into three groups with distinct prognoses (Kaplan-Meier analysis, P < 0.0005, log-rank test). All patients in cluster 1 were in stage I, with N0 lymph node status (no metastasis) and smaller tumor size (T1 or T2). Additionally, a 12-gene signature correctly predicts the stage of 94.2% of patients. Conclusions: Our results show that prediction models based on the expression levels of a small number of marker genes could accurately predict patient outcome for individualized therapy of lung adenocarcinoma. Such individualized treatment may significantly increase survival by optimizing treatment procedures and may improve lung cancer survival in each year through the 5-year checkpoint.
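The Kaplan-Meier analysis named in the abstract can be illustrated with a minimal sketch of the product-limit estimator. This is not the authors' implementation: the function name `kaplan_meier` and the five follow-up records are hypothetical toy values chosen only to show how censored patients leave the risk set without lowering the survival estimate.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  -- follow-up time for each patient
    events -- 1 if death was observed at that time, 0 if the patient was censored
    Returns a list of (time, survival_probability) at each distinct death time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)   # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both drop out after time t.
        at_risk -= leaving[t]
    return curve


# Hypothetical toy cohort (months of follow-up, event flag); not data from the study.
toy_times = [5, 8, 12, 12, 20]
toy_events = [1, 0, 1, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
for t, s in toy_curve:
    print(f"t={t}: S(t)={s:.3f}")
```

In a grouped analysis such as the three clusters described above, one curve of this kind would be estimated per cluster and the curves compared with a log-rank test; an established survival-analysis library is preferable to this sketch for real data.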
|Number of pages|11|
|Journal|Clinical Cancer Research|
|Issue number|11 I|
|State|Published - Jun 1 2006|
ASJC Scopus subject areas
- Cancer Research