
De novo prediction of human chromosome structures: Epigenetic marking patterns encode genome architecture

  • Michele Di Pierro
  • Ryan R. Cheng
  • Erez Lieberman Aiden
  • Peter G. Wolynes
  • José N. Onuchic

Research output: Contribution to journal, Article, peer-reviewed

189 Scopus citations

Abstract

Inside the cell nucleus, genomes fold into organized structures that are characteristic of cell type. Here, we show that this chromatin architecture can be predicted de novo using epigenetic data derived from chromatin immunoprecipitation-sequencing (ChIP-Seq). We exploit the idea that chromosomes encode a 1D sequence of chromatin structural types. Interactions between these chromatin types determine the 3D structural ensemble of chromosomes through a process similar to phase separation. First, a neural network is used to infer the relation between the epigenetic marks present at a locus, as assayed by ChIP-Seq, and the genomic compartment in which that locus resides, as measured by DNA-DNA proximity ligation (Hi-C). Next, types inferred from this neural network are used as an input to an energy landscape model for chromatin organization [Minimal Chromatin Model (MiChroM)] to generate an ensemble of 3D chromosome conformations at a resolution of 50 kilobases (kb). After training the model, dubbed Maximum Entropy Genomic Annotation from Biomarkers Associated to Structural Ensembles (MEGABASE), on odd-numbered chromosomes, we predict the sequences of chromatin types and the subsequent 3D conformational ensembles for the even-numbered chromosomes. We validate these structural ensembles by using ChIP-Seq tracks alone to predict Hi-C maps, as well as distances measured using 3D fluorescence in situ hybridization (FISH) experiments. Both sets of experiments support the hypothesis that phase separation is the driving process behind compartmentalization. These findings strongly suggest that epigenetic marking patterns encode sufficient information to determine the global architecture of chromosomes and that de novo structure prediction for whole genomes may be increasingly possible.
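
The hold-out scheme described in the abstract (train on odd-numbered chromosomes, predict on even-numbered ones, at 50 kb resolution) can be illustrated with a minimal sketch. This is not the authors' code: the function names `split_by_parity` and `locus_to_bin` are hypothetical, and restricting the split to autosomes 1 to 22 is an assumption made only for illustration.

```python
# Minimal sketch of an odd/even chromosome hold-out split.
# Assumption: only autosomes 1..22 are considered; names are hypothetical.

BIN_SIZE_BP = 50_000  # 50 kb resolution, as stated in the abstract


def split_by_parity(chromosomes):
    """Return (train, test): odd-numbered chromosomes for training,
    even-numbered chromosomes held out for prediction."""
    train = [c for c in chromosomes if c % 2 == 1]
    test = [c for c in chromosomes if c % 2 == 0]
    return train, test


def locus_to_bin(position_bp, bin_size=BIN_SIZE_BP):
    """Map a genomic position (in base pairs) to its 0-based bin index."""
    return position_bp // bin_size


autosomes = list(range(1, 23))
train_chroms, test_chroms = split_by_parity(autosomes)
```

Splitting by whole chromosomes, instead of by random loci, keeps every test locus on a chromosome the model never saw during training.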

Original language: English
Pages (from-to): 12126-12131
Number of pages: 6
Journal: Proceedings of the National Academy of Sciences of the United States of America
Volume: 114
Issue number: 46
State: Published - Nov 14 2017

Bibliographical note

Publisher Copyright:
© 2017, National Academy of Sciences. All rights reserved.

Funding

ACKNOWLEDGMENTS. We thank Erica J. Di Pierro for help in editing the manuscript. This work was supported by the Center for Theoretical Biological Physics, sponsored by National Science Foundation (NSF) Grant PHY-1427654. J.N.O. was also supported by NSF Grant CHE-1614101 and by the Welch Foundation (Grant C-1792). Additional support to P.G.W. was provided by the D. R. Bullard-Welch Chair at Rice University (Grant C-0016). E.L.A. was also supported by an NIH New Innovator Award (1DP2OD008540-01), the National Human Genome Research Institute (NHGRI) Center for Excellence for Genomic Sciences (HG006193), the Welch Foundation (Q-1866), an NVIDIA Research Center Award, an International Business Machines Corporation (IBM) University Challenge Award, a Google Research Award, a Cancer Prevention Research Institute of Texas Scholar Award (R1304), a McNair Medical Institute Scholar Award, an NIH 4D Nucleome Grant (U01HL130010), an NIH Encyclopedia of DNA Elements Mapping Center Award (UM1HG009375), and the President’s Early Career Award in Science and Engineering.

Funder: Funder number
Center for Theoretical Biological Physics
Google
McNair Medical Institute
President’s Early Career Award in Science and Engineering
National Institutes of Health (NIH)
Nvidia
IBM
NIH Encyclopedia of DNA Elements Mapping Center Award: UM1HG009375
NHGRI Center for Excellence for Genomic Sciences: HG006193
National Human Genome Research Institute: UM1HG009375, RM1HG006193
NIH Office of the Director: DP2OD008540
NIH 4D Nucleome: U01HL130010
National Heart, Lung, and Blood Institute (NHLBI): U01HL130010
Cancer Prevention and Research Institute of Texas: R1304
National Science Foundation: PHY-1427654, CHE-1614101
Welch Foundation: C-1792, Q-1866
Rice University: C-0016

    Keywords

    • Energy landscape theory
    • Epigenetics
    • Genomic architecture
    • Hi-C
    • Machine learning

    ASJC Scopus subject areas

    • General
