Long-read sequencing of the zebrafish genome reorganizes genomic architecture

Yelena Chernyavskaya, Xiaofei Zhang, Jinze Liu, Jessica Blackburn

Research output: Contribution to journal › Article › peer-review

10 Scopus citations

Abstract

Background: Nanopore sequencing technology has revolutionized the field of genome biology with its ability to generate extra-long reads that can resolve regions of the genome that were previously inaccessible to short-read sequencing platforms. Over 50% of the zebrafish genome consists of difficult-to-map, highly repetitive, low-complexity elements that pose inherent problems for short-read sequencers and assemblers.

Results: We used long-read nanopore sequencing to generate a de novo assembly of the zebrafish genome and compared our assembly to the current reference genome, GRCz11. The new assembly identified 1697 novel insertions and deletions over one kilobase in length and placed 106 previously unlocalized scaffolds. We also discovered additional sites of retrotransposon integration previously unreported in GRCz11 and observed the expression of these transposable elements in adult zebrafish under physiologic conditions, implying that they are actively mobile in the zebrafish genome and contribute to its ever-changing genomic landscape.

Conclusions: We used nanopore sequencing to improve upon and resolve the issues plaguing the current zebrafish reference assembly, GRCz11. Zebrafish is a prominent model of human disease, and our corrected assembly will be useful for studies relying on interspecies comparisons and precise linkage of genetic events to disease phenotypes.

Original language: English
Article number: 116
Journal: BMC Genomics
Volume: 23
Issue number: 1
State: Published - Dec 2022

Bibliographical note

Publisher Copyright:
© 2022, The Author(s).

Funding

This project was funded by the National Institutes of Health (DP2CA228043) and the Kentucky Pediatric Cancer Research Trust Foundation (to JSB). This research was also supported by the Biostatistics and Bioinformatics Shared Resource Facility of the University of Kentucky Markey Cancer Center (P30CA177558) and the VCU Massey Cancer Center Bioinformatics Core (P30CA016059). These funding bodies played no role in the design of the study, the collection, analysis, and interpretation of data, or the writing of the manuscript.

Funders (with funder number where listed)
The Markey Biostatistics and Bioinformatics Shared Resource Facility
Kentucky Pediatric Cancer Research Trust Foundation
National Institutes of Health (NIH)
National Childhood Cancer Registry – National Cancer Institute: DP2CA228043
University of Kentucky Markey Cancer Center: P30CA177558, P30CA016059

    Keywords

    • Danio rerio
    • MinION
    • Nanopore
    • Reference assembly
    • Transposon

    ASJC Scopus subject areas

    • Biotechnology
    • Genetics
