Expanding Direct Coupling Analysis to Identify Heterodimeric Interfaces from Limited Protein Sequence Data

Kareem M. Mehrabiani, Ryan R. Cheng, José N. Onuchic

Research output: Contribution to journal › Article › peer-review



Direct coupling analysis (DCA) is a global statistical approach that uses information encoded in protein sequence data to predict spatial contacts in the three-dimensional structure of a folded protein. DCA has been widely used to predict the monomeric fold at amino acid resolution and to identify biologically relevant interaction sites within a folded protein. Going beyond single proteins, DCA has also been used to identify spatial contacts that stabilize the interaction in protein complex formation. However, extracting the higher-order information needed to predict dimer contacts presents a significant challenge. The DCA evolutionary signal is much stronger at the single protein level (intraprotein contacts) than at the protein-protein interface (interprotein contacts). Therefore, if DCA-derived information is to be used to predict the structure of these complexes, there is a need to identify statistically significant DCA predictions. We propose a simple Z-score measure that can single out reliable predictions despite noisy, limited data. This new methodology not only improves our prediction ability but also provides a quantitative measure for the validity of the prediction.
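The abstract does not define the Z-score measure, so the following is only a minimal sketch of the general idea: standardize each interprotein coupling score against the distribution of all interprotein scores and keep the pairs that stand well above the background. The function names, the 3.0 threshold, and the toy scores are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev

def zscores(scores):
    """Standardize raw coupling scores: z = (score - mean) / std.

    `scores` maps a residue pair (i in protein A, j in protein B)
    to a raw DCA-style coupling score. Assumption: the background
    distribution is simply all interprotein pairs supplied here.
    """
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # All scores identical: nothing stands out from the background.
        return {pair: 0.0 for pair in scores}
    return {pair: (s - mu) / sigma for pair, s in scores.items()}

def significant_pairs(scores, threshold=3.0):
    """Return (pair, z) tuples with z >= threshold, strongest first.

    The threshold of 3.0 is an illustrative choice, not the paper's.
    """
    z = zscores(scores)
    kept = [(pair, value) for pair, value in z.items() if value >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)

# Hypothetical toy data: a flat background with one strongly coupled pair.
toy_scores = {(i, j): 0.01 for i in range(5) for j in range(5)}
toy_scores[(2, 3)] = 0.50

hits = significant_pairs(toy_scores, threshold=3.0)
for pair, z in hits:
    print(pair, round(z, 2))
```

In this toy case only the pair (2, 3) clears the threshold. With real alignments the background is far noisier, which is why a standardized score is more informative than the raw coupling value alone.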

Original language: English
Pages (from-to): 11408-11417
Number of pages: 10
Journal: Journal of Physical Chemistry B
Issue number: 41
State: Published - Oct 21 2021

ASJC Scopus subject areas

  • Physical and Theoretical Chemistry
  • Surfaces, Coatings and Films
  • Materials Chemistry

