A Sparse learning machine for high-dimensional data with application to microarray gene analysis

Research output: Contribution to journal › Article › peer-review

15 Scopus citations

Abstract

Extracting features from high-dimensional data is a critically important task for pattern recognition and machine learning applications. High-dimensional data typically have many more variables than observations and contain significant noise, missing components, or outliers. Features extracted from high-dimensional data need to be discriminative and sparse, and must capture essential characteristics of the data. In this paper, we present a way to construct multivariate features and then classify the data into proper classes. The resulting small subset of features is nearly the best in the sense of Greenshtein's persistence; however, the estimated feature weights may be biased. We take a systematic approach to correcting the biases. We use conjugate gradient-based primal-dual interior-point techniques for large-scale problems. We apply our procedure to microarray gene analysis. The effectiveness of our method is confirmed by experimental results.
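
To make the remark about biased feature weights concrete, the sketch below illustrates the general phenomenon rather than the paper's procedure. It assumes a plain L1-penalized least-squares model, solved here by coordinate descent (swapped in for the conjugate gradient-based primal-dual interior-point solver named in the abstract), followed by an unpenalized refit on the selected features, which is one common generic way to reduce shrinkage bias. The data are synthetic, and every name (`soft_threshold`, `coordinate_descent`, `rss`, `lam`) is hypothetical.

```python
import random

def soft_threshold(z, t):
    """Shrink z toward zero by t; this shrinkage is what biases L1 weights."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def coordinate_descent(X, y, lam, w_init, active, sweeps=200):
    """Minimize (1/2n)*||y - Xw||^2 + lam*||w||_1 over the coordinates in `active`."""
    n, p = len(y), len(w_init)
    w = list(w_init)
    # Mean squared value of each column, reused in every update.
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    # Residual r = y - Xw, kept up to date as weights change.
    r = [y[i] - sum(X[i][j] * w[j] for j in range(p)) for i in range(n)]
    for _ in range(sweeps):
        for j in active:
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes j's own term.
            rho = sum(X[i][j] * r[i] for i in range(n)) / n + col_sq[j] * w[j]
            new_wj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    r[i] -= X[i][j] * delta
                w[j] = new_wj
    return w

def rss(X, y, w):
    """Residual sum of squares of the linear fit Xw."""
    return sum(
        (y[i] - sum(X[i][j] * w[j] for j in range(len(w)))) ** 2
        for i in range(len(y))
    )

# Synthetic data (an assumption for illustration): more variables than observations.
rng = random.Random(0)
n, p = 20, 50
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
true_w = [3.0, -2.0] + [0.0] * (p - 2)
y = [sum(X[i][j] * true_w[j] for j in range(p)) + rng.gauss(0.0, 0.1) for i in range(n)]

# Step 1: sparse selection with an L1 penalty (weights are shrunk toward zero).
lam = 0.5
w_lasso = coordinate_descent(X, y, lam, [0.0] * p, range(p))
support = [j for j in range(p) if w_lasso[j] != 0.0]

# Step 2: unpenalized refit restricted to the selected features.
w_refit = coordinate_descent(X, y, 0.0, w_lasso, support)

print("selected features:", support)
print("training RSS, penalized fit:", rss(X, y, w_lasso))
print("training RSS, refit on support:", rss(X, y, w_refit))
```

The refit starts from the penalized solution and only updates the selected coordinates, so its training residual can never exceed that of the penalized fit. Whether such a refit improves test-set classification depends on the data and on how the penalty `lam` was chosen.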

Original language: English
Article number: 4770093
Pages (from-to): 636-646
Number of pages: 11
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume: 7
Issue number: 4
State: Published - 2010

Keywords

  • High-dimensional data
  • bias
  • cancer classification
  • convex optimization
  • feature selection
  • microarray gene analysis
  • persistence
  • primal-dual interior-point optimization

ASJC Scopus subject areas

  • Biotechnology
  • Genetics
  • Applied Mathematics
