SDA: A semi-parametric differential abundance analysis method for metabolomics and proteomics data

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

Background: Identifying differentially abundant features between experimental groups is a common goal of many metabolomics and proteomics studies. However, analyzing data from mass spectrometry (MS) is difficult because the data may not be normally distributed and often contain a large fraction of zero values. Although several statistical methods have been proposed, they either assume normally distributed data or are inefficient.

Results: We propose a new semi-parametric differential abundance analysis (SDA) method for metabolomics and proteomics data from MS. To characterize the data for each feature, the method uses a two-part model: a logistic regression for the zero proportion and a semi-parametric log-linear model for the non-zero values, which may not be normally distributed. A kernel-smoothed likelihood method is developed to estimate the model coefficients, and a likelihood ratio test is constructed for differential abundance analysis. The method has been implemented in an R package, SDAMS, which is available at https://www.bioconductor.org/packages/release/bioc/html/SDAMS.html.

Conclusion: By introducing the two-part semi-parametric model, SDA is able to handle both non-normally distributed data and a large fraction of zero values in an MS dataset. It also allows for adjustment of covariates. Simulations and real data analyses demonstrate that SDA outperforms existing methods.
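To make the two-part idea concrete, the following is a minimal Python sketch of a two-group likelihood ratio test for a single feature. It is an illustration under simplifying assumptions, not the SDA method or the SDAMS interface: the non-zero part uses a fully parametric log-normal error in place of the kernel-smoothed semi-parametric likelihood described in the abstract, there is no covariate adjustment, and the only predictor is group membership. The function names (`bernoulli_loglik`, `residual_ss`, `two_part_lrt`) are hypothetical. The combined statistic is compared with a chi-square distribution with 2 degrees of freedom, which is an asymptotic approximation.

```python
import math


def bernoulli_loglik(n_zero, n_total):
    """Maximized Bernoulli log-likelihood for n_zero zeros among n_total values."""
    p = n_zero / n_total
    loglik = 0.0
    if n_zero > 0:
        loglik += n_zero * math.log(p)
    if n_zero < n_total:
        loglik += (n_total - n_zero) * math.log(1.0 - p)
    return loglik


def residual_ss(values, center):
    """Sum of squared deviations of values from center."""
    return sum((v - center) ** 2 for v in values)


def two_part_lrt(group_a, group_b):
    """Return (statistic, p_value) for a simplified two-part test.

    Assumption: log-normal non-zero values (a parametric stand-in for the
    semi-parametric model), two groups, no covariates.
    """
    group_a, group_b = list(group_a), list(group_b)
    if any(v < 0 for v in group_a + group_b):
        raise ValueError("abundances must be non-negative")
    log_a = [math.log(v) for v in group_a if v > 0]
    log_b = [math.log(v) for v in group_b if v > 0]
    if not log_a or not log_b or len(log_a) + len(log_b) < 3:
        raise ValueError("each group needs non-zero values (at least 3 in total)")

    # Part 1: zero proportion. With a single binary predictor, logistic
    # regression reduces to comparing the two groups' zero proportions.
    zeros_a = len(group_a) - len(log_a)
    zeros_b = len(group_b) - len(log_b)
    lr_zero = 2.0 * (
        bernoulli_loglik(zeros_a, len(group_a))
        + bernoulli_loglik(zeros_b, len(group_b))
        - bernoulli_loglik(zeros_a + zeros_b, len(group_a) + len(group_b))
    )

    # Part 2: log-linear model for non-zero values with a shared error variance.
    n = len(log_a) + len(log_b)
    grand_mean = (sum(log_a) + sum(log_b)) / n
    rss_null = residual_ss(log_a, grand_mean) + residual_ss(log_b, grand_mean)
    rss_alt = (residual_ss(log_a, sum(log_a) / len(log_a))
               + residual_ss(log_b, sum(log_b) / len(log_b)))
    if rss_alt <= 0.0:
        raise ValueError("non-zero values have no within-group variation")
    lr_nonzero = n * math.log(rss_null / rss_alt)

    # Clamp tiny negative values caused by floating-point rounding.
    statistic = max(0.0, lr_zero) + max(0.0, lr_nonzero)
    # Survival function of a chi-square distribution with 2 degrees of freedom.
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value


# Example with made-up abundances for one feature in two groups.
control = [0.0, 0.0, 1.0, 1.2, 0.9, 1.1, 1.3, 0.8, 1.0, 1.1]
treated = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0, 5.5, 4.8, 5.2]
print(two_part_lrt(control, treated))
```

The sketch tests both parts jointly, so a feature can be flagged because its zero proportion differs, because its non-zero abundance differs, or both. Replacing the log-normal likelihood in part 2 with a kernel-smoothed estimate of the error density is what would make the model semi-parametric; that step is deliberately left out here.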

Original language: English
Article number: 501
Journal: BMC Bioinformatics
Volume: 20
Issue number: 1
State: Published - Oct 17 2019

Bibliographical note

Funding Information:
This work was supported by the National Institutes of Health [1R03CA211835, 5P20GM103436-15, 1P01CA163223-01A1] and the Biostatistics and Bioinformatics and Redox Metabolism Shared Resource Facilities of the University of Kentucky Markey Cancer Center [P30CA177558]. The National Institutes of Health played no role in the design of the study; in the collection, analysis, and interpretation of data; or in writing the manuscript.

Publisher Copyright:
© 2019 The Author(s).

Keywords

  • Differential abundance analysis
  • Kernel smoothing
  • Metabolomics
  • Proteomics
  • Semi-parametric log-linear model

ASJC Scopus subject areas

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics
