TY - JOUR
T1 - Construction and evaluation of a new rat reference genome assembly, GRCr8, from long reads and long-range scaffolding
AU - Li, Kai
AU - Smith, Melissa L.
AU - Blazier, J. Chris
AU - Kochan, Kelli J.
AU - Wood, Jonathan M.D.
AU - Howe, Kerstin
AU - Kwitek, Anne E.
AU - Dwinell, Melinda R.
AU - Chen, Hao
AU - Ciosek, Julia L.
AU - Masterson, Patrick
AU - Murphy, Terence D.
AU - Kalbfleisch, Theodore S.
AU - Doris, Peter A.
N1 - Publisher Copyright:
© 2024 Li et al.
PY - 2024/11
Y1 - 2024/11
N2 - We report the construction and analysis of a new reference genome assembly for Rattus norvegicus, the laboratory rat, a widely used experimental animal model organism. The assembly has been adopted as the rat reference assembly by the Genome Reference Consortium and is named GRCr8. The assembly has employed 40× Pacific Biosciences (PacBio) HiFi sequencing coverage and scaffolding using optical mapping and Hi-C. We used genomic DNA from a male BN/NHsdMcwi (BN) rat of the same strain and from the same colony as the prior reference assembly, mRatBN7.2. The assembly is at chromosome level with 98.7% of the sequence assigned to chromosomes. All chromosomes have increased in size compared with the prior assembly and k-mer analysis indicates that the subject animal is fully inbred and that the genome is represented as a single haploid assembly. Notable increases are observed in Chromosomes 3, 11, and 12 in the prospective rDNA regions. In addition, Chr Y has increased threefold in size and is more consistent with the rat karyotype than previous assemblies. Several other chromosomes have grown by the incorporation of sizable discrete new blocks. These contain highly repetitive sequences and encode numerous previously unannotated genes. In addition, centromeric sequences are incorporated in most chromosomes. Genome annotation has been performed by NCBI RefSeq, which confirms improvement in assembly quality and adds more than 1100 new protein coding genes. PacBio Iso-Seq data have been acquired from multiple tissues of the subject animal and are released concurrently with the new assembly to aid further analyses.
UR - http://www.scopus.com/inward/record.url?scp=85209718679&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85209718679&partnerID=8YFLogxK
U2 - 10.1101/gr.279292.124
DO - 10.1101/gr.279292.124
M3 - Article
C2 - 39516046
AN - SCOPUS:85209718679
SN - 1088-9051
VL - 34
SP - 2081
EP - 2093
JO - Genome Research
JF - Genome Research
IS - 11
ER -