Article | PLoS Computational Biology | Apr 3, 2014 | Gold OA

Waste Not, Want Not: Why Rarefying Microbiome Data Is Inadmissible

Stanford University

Indexed in: arXiv, Crossref, DOAJ, PubMed

Abstract

Current practice in the normalization of microbiome count data is inefficient in the statistical sense. For apparently historical reasons, the common approach is either to use simple proportions (which does not address heteroscedasticity) or to use rarefying of counts, even though both of these approaches are inappropriate for detection of differentially abundant species. Well-established statistical theory is available that simultaneously accounts for library size differences and biological variability using an appropriate mixture model. Moreover, specific implementations for DNA sequencing read count data (based on a Negative Binomial model for instance) are already available in RNA-Seq focused R packages…
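To make the two criticized normalizations concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions, not the paper's code: the function names `to_proportions` and `rarefy`, the toy counts, and the choice to rarefy to the smallest library size are all hypothetical. Rarefying is taken here to mean random subsampling of reads without replacement to a common depth.

```python
import random

def to_proportions(counts):
    """Divide each taxon count by the library size (total reads in the sample)."""
    total = sum(counts)
    return [c / total for c in counts]

def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement; return counts per taxon."""
    if depth > sum(counts):
        raise ValueError("depth exceeds library size")
    # Expand counts into one taxon label per read, then draw without replacement.
    reads = [taxon for taxon, c in enumerate(counts) for _ in range(c)]
    rng = random.Random(seed)
    kept = rng.sample(reads, depth)
    out = [0] * len(counts)
    for taxon in kept:
        out[taxon] += 1
    return out

# Hypothetical toy data: two samples with very different library sizes.
samples = {"A": [50, 30, 20], "B": [5000, 3000, 2000]}

# Assumption: rarefy every sample to the smallest library size.
depth = min(sum(c) for c in samples.values())
rarefied = {name: rarefy(c, depth) for name, c in samples.items()}

# Reads thrown away by rarefying, summed over all samples.
discarded = sum(sum(c) for c in samples.values()) - depth * len(samples)
```

In this toy case both samples have the same proportions even though B is backed by 100 times as many reads as A, so proportions alone hide the difference in precision between samples. Rarefying equalizes depth by discarding most of B's reads. Neither step models the count variance, which is what the Negative Binomial approach mentioned in the abstract is meant to address.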

Citation impact

Total citations: 3,068
FWCI: 74.69
Percentile: 100%
References: 93

Authors

2

Topics & keywords

Keywords
  • Negative binomial distribution
  • Heteroscedasticity
  • Count data
  • Microbiome
  • Computer science
  • Normalization (sociology)
  • R package
  • Mixture model

Funding