Impact of training sets on classification of high-throughput bacterial 16S rRNA gene surveys
Cornell University · The University of Queensland · +3 more institutions
Abstract
Taxonomic classification of the thousands to millions of 16S rRNA gene sequences generated in microbiome studies is often achieved using a naïve Bayesian classifier (for example, the Ribosomal Database Project II (RDP) classifier), because of favorable trade-offs among automation, speed and accuracy. The resulting classification depends on the reference sequences and taxonomic hierarchy used to train the model; although the influence of primer sets and classification algorithms has been explored in detail, the influence of the training set has not been characterized. We compared classification results obtained using three different publicly available databases as training sets, applied to five different bacterial 16S…
Citation impact
- FWCI: 10.38
- Percentile: 100%
- References: 31
Authors
- 9
Topics & keywords
- Biology
- Phylum
- 16S ribosomal RNA
- Phylotype
- Bacterial taxonomy
- Metagenomics
- Ribosomal RNA
- Pyrosequencing