TY - GEN
T1 - Functional neighbors
T2 - 2008 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2008
AU - Bandyopadhyay, Deepak
AU - Huan, Jun
AU - Liu, Jinze
AU - Prins, Jan
AU - Snoeyink, Jack
AU - Wang, Wei
AU - Tropsha, Alexander
N1 - Copyright 2009 Elsevier B.V., All rights reserved.
PY - 2008
Y1 - 2008
AB - We describe a new approach for inferring the functional relationships between non-homologous protein families by looking at statistical enrichment of alternative function predictions in classification hierarchies such as Gene Ontology (GO) and Structural Classification of Proteins (SCOP). Protein structures are represented by robust graphs, and the Fast Frequent Subgraph Mining algorithm is applied to protein families to generate sets of family-specific packing motifs, i.e. amino acid residue packing patterns shared by most family members but infrequent in other proteins. The function of a protein is inferred by identifying in it motifs characteristic of a known family. We employ these family-specific motifs to elucidate functional relationships between families in the GO and SCOP hierarchies. Specifically, we postulate that two families are functionally related if one family is statistically enriched by motifs characteristic of another family, i.e. if the number of proteins in a family containing a motif from another family is greater than expected by chance. This function inference method can help annotate proteins of unknown function, establish functional neighbors of existing families, and help specify alternate functions for known proteins.
UR - http://www.scopus.com/inward/record.url?scp=58149179017&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=58149179017&partnerID=8YFLogxK
U2 - 10.1109/BIBM.2008.84
DO - 10.1109/BIBM.2008.84
M3 - Conference contribution
AN - SCOPUS:58149179017
SN - 9780769534527
T3 - Proceedings - IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2008
SP - 199
EP - 206
BT - Proceedings - IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2008
Y2 - 3 November 2008 through 5 November 2008
ER -