On position-specific scoring matrix for protein function prediction

Jong Cheol Jeong, Xiaotong Lin, Xue Wen Chen

Research output: Contribution to journal › Article › peer-review

165 Scopus citations

Abstract

While genome sequencing projects have generated tremendous amounts of protein sequence data for a vast number of genomes, substantial portions of most genomes are still unannotated. Despite the success of experimental methods for identifying protein functions, they are often lab intensive and time consuming. Thus, it is only practical to use in silico methods for the genome-wide functional annotations. In this paper, we propose new features extracted from protein sequence only and machine learning-based methods for computational function prediction. These features are derived from a position-specific scoring matrix, which has shown great potential in other bioinformatics problems. We evaluate these features using four different classifiers and yeast protein data. Our experimental results show that features derived from the position-specific scoring matrix are appropriate for automatic function annotation.

Original language: English
Article number: 5582078
Pages (from-to): 308-315
Number of pages: 8
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume: 8
Issue number: 2
State: Published - 2011

Bibliographical note

Funding Information:
This work was supported by the US National Science Foundation (NSF) Award IIS-0644366.

Keywords

  • Clustering, classification, and association rules
  • data mining
  • feature extraction or construction
  • mining methods and algorithms

ASJC Scopus subject areas

  • Biotechnology
  • Genetics
  • Applied Mathematics
