A revamped rat reference genome improves the discovery of genetic diversity in laboratory rats

Tristan V. de Jong, Yanchao Pan, Pasi Rastas, Daniel Munro, Monika Tutaj, Huda Akil, Chris Benner, Denghui Chen, Apurva S. Chitre, William Chow, Vincenza Colonna, Clifton L. Dalgard, Wendy M. Demos, Peter A. Doris, Erik Garrison, Aron M. Geurts, Hakan M. Gunturkun, Victor Guryev, Thibaut Hourlier, Kerstin Howe, Jun Huang, Ted Kalbfleisch, Panjun Kim, Ling Li, Spencer Mahaffey, Fergal J. Martin, Pejman Mohammadi, Ayse Bilge Ozel, Oksana Polesskaya, Michal Pravenec, Pjotr Prins, Jonathan Sebat, Jennifer R. Smith, Leah C. Solberg Woods, Boris Tabakoff, Alan Tracey, Marcela Uliano-Silva, Flavia Villani, Hongyang Wang, Burt M. Sharp, Francesca Telese, Zhihua Jiang, Laura Saba, Xusheng Wang, Terence D. Murphy, Abraham A. Palmer, Anne E. Kwitek, Melinda R. Dwinell, Robert W. Williams, Jun Z. Li, Hao Chen

Research output: Contribution to journal › Article › peer-review

Abstract

The seventh iteration of the reference genome assembly for Rattus norvegicus, mRatBN7.2, corrects numerous misplaced segments, reduces base-level errors approximately 9-fold, and increases contiguity 290-fold compared with its predecessor. Gene annotations are now more complete, improving the mapping precision of genomic, transcriptomic, and proteomic datasets. We jointly analyzed 163 short-read whole-genome sequencing datasets representing 120 laboratory rat strains and substrains using mRatBN7.2. We defined ∼20.0 million sequence variations, of which 18,700 are predicted to affect the function of 6,677 genes. We also generated a new rat genetic map from 1,893 heterogeneous stock rats and annotated transcription start sites and alternative polyadenylation sites. The mRatBN7.2 assembly, together with this extensive analysis of genomic variation among rat strains, enhances our understanding of the rat genome and gives researchers an expanded resource for studies involving rats.

Original language: English
Article number: 100527
Journal: Cell Genomics
Volume: 4
Issue number: 4
State: Published - Apr 10 2024

Bibliographical note

Publisher Copyright:
© 2024 The Authors

Funding

This work is supported by the Academy of Finland (grant no. 343656) to P.R.; NIH NHLBI R01HL064541 and P01HL149620 and Office of the Director R24OD024617 to M.T.; NIH NIDA U01DA043098 to H.A.; NIH NIDA U01DA051972 to C.B.; NIH NHLBI R01HL064541 and Office of the Director R24OD024617 to W.M.D.; NIH NHLBI P01HL149620 and Office of the Director R24OD024617 to A.M.G.; Wellcome Trust WT222155/Z/20/Z to T.H.; NIH R01HG011252 to T.K.; Wellcome Trust WT222155/Z/20/Z to F.J.M.; NIH R01GM140287 to P.M.; a program from the National Institute for Research of Metabolic and Cardiovascular Diseases (Program EXCELES, ID project no. LX22NPO5104) funded by the European Union – Next Generation EU to M.P.; National Institute of Food and Agriculture, United States Department of Agriculture (2016-67015-24470/2020-67015-31733/2022-51300-38058/2023-67015-39566/2023-67015-40080) to Z.J.; NIH NIDA U01DA051234 to J.S.; NIH NHLBI R01HL064541, NHGRI U24HG010859, and Office of the Director R24OD024617 to J.R.S.; NIH NIDA P50DA037844 to L.C.S.W. (HS rats); NIH NIAAA R24AA013162 to B.T.; NIH NIDA U01DA050239 and U01DA051972 to F.T.; NIH NIDA P30DA044223 to L.S.; the National Center for Biotechnology Information of the National Library of Medicine (NLM), National Institutes of Health to T.D.M.; NIH NHLBI R01HL064541 and P01HL149620, NHGRI U24HG010859, and Office of the Director R24OD024617 to A.E.K.; NIH Office of the Director grant R24OD024617 to M.R.D. (HRDP); NIH NIDA U01DA047638 and P30DA044223 to R.W.W.; NIH NIDA U01DA043098 to J.Z.L.; and NIH NIDA U01DA047638, P50DA037844, and R01DA048017 to H.C. The majority of the computation for this work was performed on the University of Tennessee Infrastructure for Scientific Applications and Advanced Computing (ISAAC) computational resources. This work is dedicated to the memory of Dr. Mary Shimoyama.

Conceptualization, A.A.P., R.W.W., J.Z.L., and H.C.; formal analysis, T.V.d.J., Y.P., P.R., D.M., M.T., X.W., T.D.M., J.Z.L., and H.C.; visualization, T.V.d.J., Y.P., P.R., D.M., F.T., Z.J., L.S., X.W., and J.Z.L.; investigation, H.A., C.B., D.C., A.S.C., W.C., V.C., W.M.D., P.A.D., E.G., A.M.G., H.M.G., V.G., T.H., K.H., J.H., T.K., P.K., L.L., S.M., F.J.M., P.M., A.B.O., O.P., P.P., J.S., J.R.S., L.C.S.W., B.T., A.T., M.U.-S., F.V., H.W., F.T., Z.J., and L.S.; resources, C.L.D., M.P., A.E.K., and M.R.D.; data curation, W.M.D., A.M.G., J.R.S., A.E.K., and M.R.D.; writing – original draft, T.V.d.J., Y.P., P.R., D.M., T.H., L.S., X.W., T.D.M., J.Z.L., and H.C.; writing – review & editing, A.M.G., B.M.S., L.S., X.W., T.D.M., A.A.P., A.E.K., M.R.D., R.W.W., J.Z.L., and H.C.; supervision, J.Z.L. and H.C.

The authors declare no competing interests.

Funders (with funder numbers where listed)
US Department of Agriculture National Institute of Food and Agriculture, Agriculture and Food Research Initiative
University of Tennessee Infrastructure for Scientific Applications and Advanced Computing
Fair Isaac Corporation
U.S. National Library of Medicine
Research Council of Finland: 343656
Office of the Director: R24OD024617
National Institutes of Health (NIH): R01HG011252, R01DA048017, U01DA047638, R01GM140287
Wellcome Trust: WT222155/Z/20/Z
NIH/NIDA: U01DA043098, U01DA051972
DBR/NIAAA/NIH: R24AA013162, U01DA050239, P30DA044223
National Heart, Lung, and Blood Institute (NHLBI): P01HL149620, R01HL064541
U.S. Department of Agriculture: U01DA051234, 2016-67015-24470/2020-67015-31733/2022-51300-38058/2023-67015-39566/2023-67015-40080
European Commission: U01DA051234
National Institute for Research of Metabolic and Cardiovascular Diseases: LX22NPO5104
National Human Genome Research Institute: P50DA037844, U24HG010859

    Keywords

    • Rnor_6.0
    • genetic map
    • heterogeneous stock
    • hybrid rat diversity panel
    • inbred strains
    • mRatBN7.2
    • phylogenetic tree
    • rat
    • recombinant inbred
    • reference genome

    ASJC Scopus subject areas

    • Biochemistry, Genetics and Molecular Biology (miscellaneous)
    • Genetics
