Order-Sensitive Clustering for Remote Homologous Protein Detection

Jin Chen, Wynne Hsu, Mong Li Lee

Research output: Contribution to journal › Conference article › peer-review

Abstract

Traditional sequence alignment methods are effective at identifying homologous proteins that are highly similar. However, they perform poorly on remote homologous proteins, that is, proteins whose 3D structures are similar but whose sequences are not. Recent biological research reveals that protein sequences contain residues that determine the 3D structure of proteins. In this work, we investigate incorporating this information to aid in the clustering of protein databases. We capture these residues as patterns with a fixed order among them. First, the significant patterns are extracted from the protein sequences. Next, we perform sequence mining on the extracted patterns to determine the order among them. Finally, we adopt a partition-based method to cluster protein sequences using the patterns and order features. Experiments on the COG and SCOP40 datasets show that our approach generates high-quality clusters similar to those determined manually by biologists.
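
The three stages named in the abstract (pattern extraction, order mining, partition-based clustering) can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not specify the algorithms, so this sketch assumes frequent k-mers as a stand-in for significant patterns, pairwise first-occurrence order as a stand-in for sequence mining, and k-medoids over Jaccard similarity as a stand-in for the partition-based step. All function names, parameters, and the toy sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_kmers(seqs, k=3, min_support=2):
    """k-mers present in at least min_support sequences (assumed stand-in for 'significant patterns')."""
    support = Counter()
    for s in seqs:
        support.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return {p for p, n in support.items() if n >= min_support}


def order_features(seq, patterns):
    """Patterns found in seq, plus ordered pairs (p, q) where p first occurs before q."""
    first = {p: seq.find(p) for p in patterns if p in seq}
    feats = set(first)
    for p, q in combinations(sorted(first), 2):
        # Distinct k-mers of equal length cannot share a start position.
        feats.add((p, q) if first[p] < first[q] else (q, p))
    return feats


def jaccard(a, b):
    """Jaccard similarity between two feature sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def assign(feats, medoids):
    """Label each item with the index of its most similar medoid."""
    return [max(range(len(medoids)), key=lambda c: jaccard(f, feats[medoids[c]]))
            for f in feats]


def k_medoids(feats, medoids, max_iter=20):
    """Simple k-medoids: alternate assignment and medoid update until stable."""
    medoids = list(medoids)
    for _ in range(max_iter):
        labels = assign(feats, medoids)
        updated = []
        for c, old in enumerate(medoids):
            members = [i for i, lab in enumerate(labels) if lab == c]
            # New medoid: the member most similar to the rest of its cluster.
            best = max(members,
                       key=lambda i: sum(jaccard(feats[i], feats[j]) for j in members),
                       default=old)
            updated.append(best)
        if updated == medoids:
            break
        medoids = updated
    return assign(feats, medoids)


# Toy sequences: all four share the patterns AAA and CCC, but in two different orders.
seqs = ["AAAXCCC", "AAAYCCC", "CCCXAAA", "CCCYAAA"]
patterns = frequent_kmers(seqs)
feats = [order_features(s, patterns) for s in seqs]
print(k_medoids(feats, medoids=[0, 2]))  # prints [0, 0, 1, 1]
```

In this toy input every sequence contains the same two patterns, so a pattern-only feature set could not separate them; the ordered-pair features are what split the sequences into two clusters.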

Original language: English
Pages (from-to): 26-30
Number of pages: 5
Journal: Proceedings of the International Conference on Tools with Artificial Intelligence
State: Published - 2003
Event: Proceedings: 15th IEEE International Conference on Tools with Artificial Intelligence - Sacramento, CA, United States
Duration: Nov 3 2003 → Nov 5 2003

ASJC Scopus subject areas

  • Software
