Protein simple sequences, a subset of low-complexity sequences, are regions of sequence highly enriched in one or a few residue types. Simple sequences are exceedingly common, averaging more than one per protein sequence. Despite being so common, such sequences are not well studied. Those that have been subjected to detailed study are often found to possess important functions. Here we present a survey of protein simple sequences, generally enriched in a single residue type, with the aim of studying their conservation. We find that the majority of such simple sequences are not conserved. However, conserved protein simple sequences are relatively common, with ∼11% of the surveyed protein families possessing a conserved simple sequence. The data obtained in this study support the idea that simple sequences are conserved for functional reasons. Such functions range from substrate binding, to mediating protein-protein interactions, to maintaining structural integrity. A perhaps surprising finding is that the residue enriching a conserved simple sequence is itself not necessarily conserved, and neither is the length of many of the highly conserved simple sequences. In the few cases where structural and functional data are available, the conserved simple sequences are found to be consistent with both local structure and function. The data presented support the idea that protein simple sequences can be conserved and have important roles in protein structure and function.
|Number of pages|10|
|Journal|Proteins: Structure, Function and Genetics|
|State|Published - Mar 1 2004|
ASJC Scopus subject areas
- Structural Biology
- Molecular Biology