With the rapid evolution of high-throughput technologies, time series/longitudinal high-throughput experiments have become possible and affordable. However, the development of statistical methods for gene expression profiles measured across time points has not kept pace with the explosion of such data. Feature selection is of critical importance for longitudinal microarray data. In this study, we propose aggregating a gene's expression values across time into a single value using the sign average method, thereby reducing longitudinal feature selection to a classic feature selection problem. Regularized logistic regression models with pseudogenes (i.e., the sign averages of genes across time) as predictors were then optimized by either the coordinate descent method or the threshold gradient descent regularization method. By applying the proposed methods to simulated data and a traumatic injury dataset, we demonstrate that the proposed methods, especially the combination of sign average and threshold gradient descent regularization, outperform other competitive algorithms. We therefore highly recommend the proposed methods for studies whose objective is feature selection for longitudinal gene expression data.
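The two steps described above, collapsing each gene's time course into one pseudogene and then fitting a sparse logistic model, can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not define the sign average, so the sketch assumes the sign for a gene at a time point is the sign of the difference in class means at that time point. The optimizer is a generic threshold gradient descent update (only coefficients whose gradient is within a fraction `tau` of the largest gradient are moved). The function names, the parameters `tau`, `step`, and `n_iter`, and the toy data are all hypothetical.

```python
import math

def sign(v):
    """Return -1, 0, or +1 according to the sign of v."""
    return (v > 0) - (v < 0)

def sign_average(X, y):
    """Collapse X[i][g][t] (sample, gene, time) into pseudogenes Z[i][g].

    Assumption (not taken from the paper): the sign for gene g at time t
    is the sign of the class-1 mean minus the class-0 mean at that time.
    """
    n, G, T = len(X), len(X[0]), len(X[0][0])
    pos = [i for i in range(n) if y[i] == 1]
    neg = [i for i in range(n) if y[i] == 0]

    def mean(idx, g, t):
        return sum(X[i][g][t] for i in idx) / len(idx)

    signs = [[sign(mean(pos, g, t) - mean(neg, g, t)) for t in range(T)]
             for g in range(G)]
    # Signed mean across time: one value per sample and gene.
    return [[sum(signs[g][t] * X[i][g][t] for t in range(T)) / T
             for g in range(G)] for i in range(n)]

def tgdr_logistic(Z, y, tau=0.8, step=0.05, n_iter=200):
    """Logistic regression fitted by a thresholded gradient update.

    At each iteration only coefficients with |gradient| >= tau * max|gradient|
    are updated; the intercept is always updated (an assumption).
    """
    n, G = len(Z), len(Z[0])
    b0, beta = 0.0, [0.0] * G
    for _ in range(n_iter):
        resid = []
        for i in range(n):
            eta = b0 + sum(beta[g] * Z[i][g] for g in range(G))
            resid.append(y[i] - 1.0 / (1.0 + math.exp(-eta)))
        # Gradient of the mean log-likelihood with respect to each coefficient.
        grad = [sum(resid[i] * Z[i][g] for i in range(n)) / n for g in range(G)]
        gmax = max(abs(v) for v in grad)
        if gmax == 0.0:
            break
        for g in range(G):
            if abs(grad[g]) >= tau * gmax:
                beta[g] += step * grad[g]
        b0 += step * sum(resid) / n
    return b0, beta

# Toy data: 6 samples, 3 genes, 2 time points. Gene 0 flips direction
# between the two time points; genes 1 and 2 are low-amplitude noise.
X = [
    [[-1.0, 1.0], [0.2, -0.1], [0.0, 0.1]],
    [[-0.9, 1.3], [-0.1, 0.1], [0.1, -0.2]],
    [[-1.1, 0.7], [0.0, 0.1], [-0.1, 0.0]],
    [[1.0, -1.0], [0.1, 0.0], [0.1, 0.1]],
    [[1.2, -0.8], [-0.2, 0.2], [-0.2, 0.0]],
    [[0.8, -1.0], [0.1, -0.1], [0.0, -0.1]],
]
y = [0, 0, 0, 1, 1, 1]

Z = sign_average(X, y)
b0, beta = tgdr_logistic(Z, y)
selected = [g for g in range(len(beta)) if beta[g] != 0.0]
print("selected genes:", selected)
```

In this toy example a plain average of gene 0 across its two time points is close to zero in both classes, because the gene moves in opposite directions at the two times. The signed average keeps that signal, and the thresholded update leaves the coefficients of the two noise genes at zero. Because the signs in this sketch are derived from the class labels, in practice they would need to be estimated on training data only.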
|Journal||BioMed Research International|
|State||Published - 2019|
Bibliographical note
Funding Information:
This study was supported by funding (No. 31401123) from the Natural Science Foundation of China. The authors thank the Markey Cancer Center’s Research Communications Office for assistance with preparing this manuscript.
© 2019 Suyan Tian and Chi Wang.
ASJC Scopus subject areas
- Biochemistry, Genetics and Molecular Biology (all)
- Immunology and Microbiology (all)