FastCount: A fast gene count software for single-cell RNA-seq data

Jinpeng Liu, Xinan Liu, Ye Yu, Chi Wang, Jinze Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

Motivation: The advent of single-cell RNA-seq (scRNA-seq) enables scientists to characterize the transcriptomic response of cells under different conditions and to understand expression heterogeneity at the single-cell level. One of the fundamental steps in scRNA-seq analysis is to summarize raw sequencing reads into a list of gene counts for each individual cell. However, this step remains the most time-consuming and resource-intensive in the analysis workflow because of the large amount of data produced in an scRNA-seq experiment. It is further complicated by the special handling of cell barcode and unique molecular identifier (UMI) information in the read sequences. For example, gene count summarization of 10X Chromium sequencing data by the standard Cell Ranger count often takes many hours to finish when running on a computing cluster. Although several alignment-free algorithms have been developed to improve efficiency, their derived gene counts suffer from poor concordance with Cell Ranger count and from algorithm-specific bias [1].

Results: In this work, we present a lightweight k-mer-based gene count algorithm, FastCount, to support efficient UMI counting from single-cell RNA-seq data. We demonstrate that FastCount is over an order of magnitude faster than Cell Ranger count while achieving competitive accuracy on 10X Genomics single-cell RNA-seq data. FastCount is a stand-alone program implemented in C++. The source code is located at https://bitbucket.org/merckey/fastcount/.
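The abstract does not describe FastCount's internals, so the following is only a toy Python sketch of the general idea behind k-mer-based UMI counting, not the authors' C++ implementation. Everything in it is an assumption made for illustration: the function names, the tiny k of 5, the 4-base barcode and 3-base UMI, the rule that only gene-unique k-mers vote, and the made-up sequences. It indexes the k-mers of each gene, assigns each cDNA read to the gene with the most k-mer hits, and counts distinct UMIs per cell barcode and gene.

```python
from collections import defaultdict

K = 5  # toy k-mer length (assumption); practical tools use a much larger k


def kmers(seq, k=K):
    """Yield every overlapping k-mer of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(transcripts, k=K):
    """Map each k-mer to the set of genes whose sequence contains it."""
    index = defaultdict(set)
    for gene, seq in transcripts.items():
        for km in kmers(seq, k):
            index[km].add(gene)
    return index


def assign_gene(read, index, k=K):
    """Return the gene with the most gene-unique k-mer hits, or None on no hit or a tie."""
    votes = defaultdict(int)
    for km in kmers(read, k):
        genes = index.get(km, ())
        if len(genes) == 1:  # k-mers shared between genes do not vote
            votes[next(iter(genes))] += 1
    if not votes:
        return None
    ranked = sorted(votes.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # ambiguous read
    return ranked[0][0]


def count_umis(reads, index, barcode_len=4, umi_len=3):
    """reads: iterable of (r1, r2), with r1 = barcode + UMI and r2 = cDNA sequence.

    Returns {barcode: {gene: number of distinct UMIs}}.
    """
    seen = defaultdict(set)
    for r1, r2 in reads:
        barcode = r1[:barcode_len]
        umi = r1[barcode_len:barcode_len + umi_len]
        gene = assign_gene(r2, index)
        if gene is not None:
            seen[(barcode, gene)].add(umi)  # a set collapses duplicate UMIs
    counts = defaultdict(dict)
    for (barcode, gene), umis in seen.items():
        counts[barcode][gene] = len(umis)
    return dict(counts)


# Made-up toy data, for illustration only.
transcripts = {"GENE_A": "ACGTACGGTCA", "GENE_B": "TTGACCATGGA"}
reads = [
    ("AAAACCC", "ACGTACGG"),  # cell AAAA, UMI CCC, matches GENE_A
    ("AAAACCC", "GTACGGTC"),  # same cell and UMI: counted once
    ("AAAAGGG", "ACGGTCA"),   # same cell, new UMI
    ("TTTTCCC", "TGACCATG"),  # cell TTTT, matches GENE_B
    ("TTTTAAA", "GGGGGGGG"),  # no k-mer hit: dropped
]
counts = count_umis(reads, build_index(transcripts))
print(counts)
```

The sketch omits barcode whitelisting, error correction of barcodes and UMIs, and strand handling, all of which a production pipeline would need.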

Original language: English
Title of host publication: Proceedings of the 12th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, BCB 2021
ISBN (Electronic): 9781450384506
State: Published - Jan 18 2021
Event: 12th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, BCB 2021 - Virtual, Online, United States
Duration: Aug 1 2021 to Aug 4 2021

Publication series

Name: Proceedings of the 12th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, BCB 2021

Conference

Conference: 12th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, BCB 2021
Country/Territory: United States
City: Virtual, Online
Period: 8/1/21 to 8/4/21

Bibliographical note

Publisher Copyright:
© 2021 ACM.

Keywords

  • UMI quantification
  • alignment-free
  • k-mers
  • single cell RNA-seq

ASJC Scopus subject areas

  • Computer Science Applications
  • Software
  • Biomedical Engineering
  • Health Informatics
