Abstract
DNA sequencing technology has improved substantially in recent years, to the extent that Third Generation Sequencing platforms can now generate long reads at massive scale. Amplicon sequencing has been among the most popular techniques because of its wide application across the biological sciences. However, there is a lack of software specifically designed to analyse intra-individual genetic variation using amplicon long-read data. Here, we present CCS-consensuser, an end-to-end pipeline that generates consensus sequences from amplicon sequencing data using high-fidelity reads produced by PacBio circular consensus sequencing (CCS). We evaluated the concordance between results produced with CCS + CCS-consensuser and those from other sequencing platforms (Illumina and Sanger), and assessed accuracy using a simulated dataset. This assessment showed that CCS amplicon data coupled with CCS-consensuser can produce high-quality sequences (PHRED > 30). The pipeline yielded high proportions of identical sequence bins for real data, achieving up to 94.94% concordance with COI Sanger sequences and 92.61% with nuclear loci Illumina sequences (considering heterozygous loci), and 95.55% with a fully phased nuclear simulated dataset. Furthermore, our pipeline can be used to detect heteroplasmy in mtDNA and cross-contamination, to resolve the phase of nuclear genes in diploid organisms, and conceivably to analyse multi-copy gene systems such as rDNA. These results not only support its potential for studies using haploid data, such as DNA barcoding, but also demonstrate its unique capacity to explore within-individual haplotype variation. Therefore, our strategy shows promise for a broad range of applications in biology and medicine that have been challenging to assess using traditional techniques.
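For readers unfamiliar with the PHRED scale used in the abstract, the following is a minimal sketch of the standard conversion between a PHRED quality score Q and the per-base error probability, P = 10^(-Q/10). It is not part of CCS-consensuser; the function names and the 1,000-base sequence length are hypothetical and chosen only for illustration.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Per-base error probability implied by a PHRED score: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """PHRED score implied by a per-base error probability: Q = -10 * log10(P)."""
    return -10 * math.log10(p)


def expected_errors(q: float, length: int) -> float:
    """Expected number of erroneous bases in a sequence of uniform quality q."""
    return phred_to_error_prob(q) * length


if __name__ == "__main__":
    # Hypothetical example: a 1,000-base sequence at the Q30 threshold.
    print(phred_to_error_prob(30))
    print(expected_errors(30, 1000))
```

Under this definition, PHRED > 30 corresponds to fewer than one expected error per 1,000 bases.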
| Field | Value |
|---|---|
| Original language | English |
| Article number | e14113 |
| Journal | Molecular Ecology Resources |
| Volume | 25 |
| Issue number | 7 |
| State | Published - Oct 2025 |
Bibliographical note
Publisher Copyright: © 2025 The Author(s). Molecular Ecology Resources published by John Wiley & Sons Ltd. This article has been contributed to by U.S. Government employees and their work is in the public domain in the USA.
Funding
This study was supported by the U.S. Department of Agriculture-Animal and Plant Health Inspection Service (8130-0565-CA, 8130-0984-IA). This study was funded by the United States Department of Agriculture (USDA) Plant Protection Act 7721 and USDA Agricultural Research Service (ARS). These funds were managed as an interagency agreement between the United States Department of Agriculture-Animal and Plant Health Inspection Service (USDA-APHIS) and USDA-ARS (8130-0984-IA) and a cooperative agreement with the University of Hawaii Manoa College of Tropical Agriculture and Human Resources (8130-0565-CA). This research was also supported by the in-house appropriated USDA-ARS project Advancing Molecular Pest Management, Diagnostics and Eradication of Fruit Flies and Invasive Species (no. 2040-22430-028-000-D) and used resources provided by SCINet, USDA-ARS projects no. 0201-88888-003-000D and 0201-88888-002-000D. USDA is an equal opportunity provider and employer. Mention of trade names or commercial products in this publication is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the USDA or the U.S. Federal Government.
| Funders | Funder number |
|---|---|
| University of Hawaii Manoa College of Tropical Agriculture and Human Resources | |
| United States Department of Agriculture | |
| USDA Agricultural Research Service (USDA-ARS) | 0201-88888-003-000D, 2040-22430-028-000-D |
| USDA Animal and Plant Health Inspection Service (USDA-APHIS) | 8130-0984-IA, 8130-0565-CA |
| USDA SCINet scientific computing infrastructure | 0201-88888-002-000D |
Keywords
- amplicon sequencing
- circular consensus sequencing
- consensus sequence
- intraindividual variation
- long-read sequencing
ASJC Scopus subject areas
- Biotechnology
- Ecology, Evolution, Behavior and Systematics
- Genetics
Fingerprint
Dive into the research topics of 'CCS-Consensuser: A Haplotype-Aware Consensus Generator for PacBio Amplicon Sequences'. Together they form a unique fingerprint.