Article · PLoS ONE · Feb 3, 2012 · Gold OA

Dirichlet Multinomial Mixtures: Generative Models for Microbial Metagenomics

University of California, Berkeley · University of Glasgow

Indexed in: Crossref, DOAJ, PubMed

Abstract

We introduce Dirichlet multinomial mixtures (DMM) for the probabilistic modelling of microbial metagenomics data. These data can be represented as a frequency matrix giving the number of times each taxon is observed in each sample. The samples have different sizes, and the matrix is sparse, as communities are diverse and skewed towards rare taxa. Most methods used previously to classify or cluster samples have ignored these features. We describe each community by a vector of taxa probabilities. These vectors are generated from one of a finite number of Dirichlet mixture components, each with different hyperparameters. Observed samples are generated through multinomial sampling. The mixture components cluster…
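The generative process described in the abstract can be illustrated with a short simulation. This is a minimal sketch and not the authors' implementation: the function name `sample_dmm`, its parameters, and the example hyperparameter values are all hypothetical. It assumes the mixture weights and Dirichlet hyperparameters are already known, and it does not attempt the fitting or model selection that the paper itself addresses.

```python
import numpy as np


def sample_dmm(depths, mixture_weights, alphas, seed=0):
    """Simulate a taxa count matrix from a Dirichlet multinomial mixture.

    All names here are illustrative assumptions, not taken from the paper.

    depths          : sequence of N ints, number of reads in each sample
    mixture_weights : sequence of K floats summing to 1
    alphas          : (K, S) array of Dirichlet hyperparameters,
                      one row per mixture component, S taxa
    Returns (counts, labels): an (N, S) integer matrix and the
    component index that generated each sample.
    """
    rng = np.random.default_rng(seed)
    weights = np.asarray(mixture_weights, dtype=float)
    alphas = np.asarray(alphas, dtype=float)
    n_components, n_taxa = alphas.shape

    # Step 1: assign every sample to one mixture component.
    labels = rng.choice(n_components, size=len(depths), p=weights)

    counts = np.zeros((len(depths), n_taxa), dtype=np.int64)
    for i, k in enumerate(labels):
        # Step 2: draw the community's vector of taxa probabilities
        # from the Dirichlet of the chosen component.
        taxa_probs = rng.dirichlet(alphas[k])
        # Step 3: draw the observed counts by multinomial sampling,
        # so each sample keeps its own size.
        counts[i] = rng.multinomial(int(depths[i]), taxa_probs)
    return counts, labels


if __name__ == "__main__":
    # Two hypothetical community types over six taxa.
    example_alphas = [
        [0.5, 0.5, 0.5, 0.5, 0.5, 0.5],   # even but sparse communities
        [5.0, 0.2, 0.2, 0.2, 0.2, 0.2],   # dominated by the first taxon
    ]
    counts, labels = sample_dmm([100, 250, 50, 1000, 10],
                                [0.5, 0.5], example_alphas)
    print(labels)
    print(counts)
```

Each row of `counts` sums to the corresponding entry of `depths`, which reflects the point that samples differ in size. Hyperparameters below 1 tend to produce sparse rows, loosely mimicking communities skewed towards rare taxa.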

Citation impact

Total citations: 943
FWCI: 6.72
Percentile: 100%
References: 37

Authors

3

Topics & keywords

Keywords
  • Metagenomics
  • Multinomial distribution
  • Mixture model
  • Biology
  • Dirichlet distribution
  • Mathematics
  • Statistics
  • Computer science

Funding