Feature Selection for Longitudinal Data by Using Sign Averages to Summarize Gene Expression Values over Time

Suyan Tian, Chi Wang

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

With the rapid evolution of high-throughput technologies, time series/longitudinal high-throughput experiments have become possible and affordable. However, the development of statistical methods for gene expression profiles measured across time points has not kept pace with the explosion of such data. Feature selection is of critical importance for longitudinal microarray data. In this study, we propose aggregating a gene's expression values across time into a single value using the sign average method, thereby reducing a longitudinal feature selection problem to a classical one. Regularized logistic regression models with pseudogenes (i.e., the sign averages of genes across time) as predictors were then optimized by either the coordinate descent method or the threshold gradient descent regularization method. By applying the proposed methods to simulated data and a traumatic injury dataset, we demonstrate that the proposed methods, especially the combination of sign average and threshold gradient descent regularization, outperform competing algorithms. To conclude, the proposed methods are highly recommended for studies whose objective is feature selection for longitudinal gene expression data.
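The aggregation step described in the abstract can be illustrated with a minimal sketch. The abstract does not define the sign average, so the following rests on an assumption: each time point's expression value is multiplied by the sign of its association with the class label (taken here as the sign of the sample covariance), and the signed values are averaged over time. The function names `association_sign` and `sign_average` and the toy data are hypothetical, the code uses only the standard library to stay self-contained, and the regularized logistic regression step is not shown.

```python
def mean(values):
    return sum(values) / len(values)


def association_sign(x, y):
    """Sign (+1.0, -1.0 or 0.0) of the covariance between x and the labels y.

    Assumption: the abstract does not say how the sign is obtained; the
    covariance with a binary class label is used here for illustration.
    """
    mx, my = mean(x), mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if cov == 0:
        return 0.0
    return 1.0 if cov > 0 else -1.0


def sign_average(expr, labels):
    """Collapse expr[sample][gene][time] into pseudo[sample][gene].

    Each pseudogene value is the average over time of the expression
    values, each multiplied by the sign computed for that gene and time.
    """
    n_samples = len(expr)
    n_genes = len(expr[0])
    n_times = len(expr[0][0])

    # One sign per (gene, time point), estimated across all samples.
    signs = [
        [
            association_sign([expr[s][g][t] for s in range(n_samples)], labels)
            for t in range(n_times)
        ]
        for g in range(n_genes)
    ]

    # Signed average over time for every sample and gene.
    return [
        [
            sum(signs[g][t] * expr[s][g][t] for t in range(n_times)) / n_times
            for g in range(n_genes)
        ]
        for s in range(n_samples)
    ]


# Toy data: 4 samples, 2 genes, 2 time points (made-up values).
labels = [0, 0, 1, 1]
expr = [
    [[1, 4], [5, 5]],
    [[2, 3], [5, 5]],
    [[3, 2], [5, 5]],
    [[4, 1], [5, 5]],
]
pseudo = sign_average(expr, labels)
```

In this toy example the first gene rises with the label at the first time point and falls at the second, so the signs keep the two time points from cancelling out, whereas a plain average over time would give 2.5 for every sample. The resulting sample-by-gene matrix could then be passed to any regularized logistic regression fitter.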

Original language: English
Article number: 1724898
Journal: BioMed Research International
Volume: 2019
State: Published - 2019

Bibliographical note

Publisher Copyright:
© 2019 Suyan Tian and Chi Wang.

ASJC Scopus subject areas

  • General Biochemistry, Genetics and Molecular Biology
  • General Immunology and Microbiology
