Abundance and distributions of eukaryote protein simple sequences.

Kim Lan Sim, Trevor P. Creamer

Research output: Contribution to journal › Article › peer-review

38 Scopus citations


Protein simple sequences are a subclass of low complexity regions of sequence that are highly enriched in one or a few residue types. Such sequences are common in transcription regulatory proteins, in structural proteins, in proteins involved in nucleic acid interactions, and in mediating protein-protein interactions. Simple sequences of 10 or more residues containing ≥50% of a single residue type are surveyed in this work. Both eukaryote and prokaryote proteomes are investigated, with emphasis on the eukaryotes. Very large numbers of such sequences are found in all organisms surveyed. It is found that eukaryotes possess far more simple sequences per protein than do the prokaryotes. Prokaryotes display a linear relationship between the number of proteins containing simple sequences and proteome size, whereas it is not clear that such a relationship holds for eukaryotes. Strikingly, it is found that each eukaryote possesses its own unique distribution of simple sequences. Within those distributions it is found that simple sequences enriched in certain residue types are clearly favored, whereas others are just as clearly discriminated against. The preferences observed are not correlated with residue occurrence. An analysis of classes of proteins of known function suggests that simple sequence occurrence and distribution may be related to protein function. Based upon this analysis, the number of simple sequences found in excess of what would be expected from a simple statistical model, and the known functional importance of numerous such sequences, it is postulated that eukaryotes have evolved not only to tolerate large numbers of simple sequences but also to require them.
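
The selection criterion stated in the abstract (10 or more residues, at least 50% of one residue type) can be illustrated with a minimal sketch. The abstract does not describe the authors' search procedure, so everything below is an assumption made for illustration: the function name `simple_sequence_regions`, the fixed sliding window of 10 residues, and the rule that overlapping qualifying windows sharing the same dominant residue are merged into one region. Regions found this way may include a few flanking residues at each end.

```python
from collections import Counter


def simple_sequence_regions(seq, window=10, min_fraction=0.5):
    """Return (start, end, residue) tuples, with end exclusive.

    Hypothetical illustration only: slides a fixed-length window over the
    sequence, keeps windows whose most common residue makes up at least
    min_fraction of the window, and merges overlapping or adjacent windows
    that share the same dominant residue.
    """
    regions = []  # each entry is [start, end, residue]
    for start in range(len(seq) - window + 1):
        residue, count = Counter(seq[start:start + window]).most_common(1)[0]
        if count < min_fraction * window:
            continue  # window is not enriched enough in any single residue
        end = start + window
        last = regions[-1] if regions else None
        if last and last[2] == residue and start <= last[1]:
            last[1] = end  # extend the region currently being built
        else:
            regions.append([start, end, residue])
    return [tuple(region) for region in regions]


if __name__ == "__main__":
    # Made-up test sequence containing a 12-residue glutamine run.
    example = "MKTLVPSDEG" + "Q" * 12 + "AVLIKRDEFG"
    print(simple_sequence_regions(example))
```

Both thresholds are exposed as parameters so that the length and enrichment cut-offs can be varied. A production survey would also need FASTA parsing and a considered policy for windows in which two residue types tie for most common, neither of which is covered here.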

Original language: English
Pages (from-to): 983-995
Number of pages: 13
Journal: Molecular and Cellular Proteomics
Issue number: 12
State: Published - Dec 2002

ASJC Scopus subject areas

  • Analytical Chemistry
  • Biochemistry
  • Molecular Biology

