CCS-Consensuser: A Haplotype-Aware Consensus Generator for PacBio Amplicon Sequences

  • Carlos Congrains
  • Forest Bremer
  • Julian R. Dupuis
  • Norman B. Barr
  • Ivonne J. Garzón-Orduña
  • Daniel Rubinoff
  • Camiel Doorenweerd
  • Michael San Jose
  • Kimberley Morris
  • Angela Kauwe
  • Scott Geib

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

DNA sequencing technology has improved substantially in recent years, to the extent that Third Generation Sequencing platforms can now generate long reads on a massive scale. Amplicon sequencing has been among the most popular techniques due to its wide application in diverse fields of the biological sciences. However, there is a lack of software specifically designed to analyse intra-individual genetic variation using amplicon long-read data. Here, we present CCS-consensuser, an end-to-end pipeline that generates consensus sequences from amplicon sequencing using high-fidelity reads produced by PacBio circular consensus sequencing (CCS). We evaluated the concordance between results produced using CCS + CCS-consensuser and those from other sequencing platforms (Illumina and Sanger), and assessed accuracy using a simulated dataset. This assessment showed that CCS amplicon data coupled with CCS-consensuser can produce high-quality sequences (PHRED > 30). The pipeline resulted in high proportions of identical sequence bins for real data, achieving up to 94.94% concordance with COI Sanger sequences and 92.61% with nuclear loci Illumina sequences (considering heterozygous loci), and 95.55% with a fully phased nuclear simulated dataset. Furthermore, our pipeline can be used to detect heteroplasmy in mtDNA and cross-contamination, to resolve the phase of nuclear genes in diploid organisms, and, conceivably, to study multi-copy gene systems such as rDNA. These results not only support its potential for application in studies using haploid data such as DNA barcoding, but also demonstrate its unique capacity to explore within-individual haplotype variation. Therefore, our strategy shows promise for a broad range of applications in biology and medicine that have been challenging to assess using traditional techniques.
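
For readers unfamiliar with the PHRED scale used in the quality threshold above, the following minimal Python sketch shows the standard relationship between a PHRED score Q and the probability P that a base call is wrong, P = 10^(-Q/10). The function names are illustrative only and are not part of CCS-consensuser.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a PHRED quality score to the probability of a wrong base call."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert a per-base error probability (0 < p <= 1) to a PHRED score."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in (0, 1]")
    return -10 * math.log10(p)


if __name__ == "__main__":
    # Show how the error probability shrinks as the PHRED score grows.
    for q in (20, 30, 40):
        print(f"Q{q}: error probability {phred_to_error_prob(q):.4g}")
```

Under this definition, a PHRED score of 30 corresponds to an expected error rate of one in 1,000 bases, so sequences above that threshold are expected to be at least 99.9% accurate per base.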

Original language: English
Article number: e14113
Journal: Molecular Ecology Resources
Volume: 25
Issue number: 7
DOIs
State: Published - Oct 2025

Bibliographical note

Publisher Copyright:
© 2025 The Author(s). Molecular Ecology Resources published by John Wiley & Sons Ltd. This article has been contributed to by U.S. Government employees and their work is in the public domain in the USA.

Funding

This study was supported by U.S. Department of Agriculture-Animal and Plant Health Inspection Service, 8130-0565-CA, 8130-0984-IA. This study was funded by the United States Department of Agriculture (USDA) Plant Protection Act 7721 and USDA Agricultural Research Service (ARS). These funds were managed as an interagency agreement between the United States Department of Agriculture-Animal and Plant Health Inspection Service (USDA-APHIS) and USDA-ARS (8130-0984-IA) and a cooperative agreement with the University of Hawaii Manoa College of Tropical Agriculture and Human Resources (8130-0565-CA). This research was also supported by the in-house appropriated USDA-ARS project Advancing Molecular Pest Management, Diagnostics and Eradication of Fruit Flies and Invasive Species (no. 2040-22430-028-000-D) and used resources provided by SCINet, USDA-ARS projects no. 0201-88888-003-000D and 0201-88888-002-000D. USDA is an equal opportunity provider and employer. Mention of trade names or commercial products in this publication is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the USDA or the U.S. Federal Government.

Funders (with funder numbers where listed)

  • University of Hawaii Manoa College of Tropical Agriculture and Human Resources
  • USDA-Agricultural Research Service
  • United States Department of Agriculture
  • U.S. Department of Agriculture
  • USDA-ARS: 0201-88888-003-000D, 2040-22430-028-000-D
  • Animal and Plant Health Inspection Service: 8130-0984-IA, 8130-0565-CA
  • USDA SCINet scientific computing infrastructure: 0201-88888-002-000D

    Keywords

    • amplicon sequencing
    • circular consensus sequencing
    • consensus sequence
    • intraindividual variation
    • long-read sequencing

    ASJC Scopus subject areas

    • Biotechnology
    • Ecology, Evolution, Behavior and Systematics
    • Genetics
