Article · BMC Bioinformatics · Jan 28, 2011 · Gold OA

Removing Noise From Pyrosequenced Amplicons

Christopher Quince, Anders Lanzen, Russell J Davenport, Peter J Turnbaugh

University of Glasgow · Centre Jean Perrin

Indexed in Crossref · DOAJ · PubMed

Abstract

Background

In many environmental genomics applications, a homologous region of DNA from a diverse sample is first amplified by PCR and then sequenced. The next generation sequencing technology, 454 pyrosequencing, has allowed much larger read numbers from PCR amplicons than ever before. This has revolutionised the study of microbial diversity, as it is now possible to sequence a substantial fraction of the 16S rRNA genes in a community. However, there is a growing realisation that, because of the large read numbers and the lack of consensus sequences, it is vital to distinguish noise from true sequence diversity in these data. Failing to do so leads to inflated estimates of the number of types or operational taxonomic units (OTUs) present. Three sources of error are important: sequencing error, PCR single base substitutions and PCR chimeras. We present AmpliconNoise, a development of the PyroNoise algorithm that is capable of separately removing 454 sequencing errors and PCR single base errors. We also introduce a novel chimera removal program, Perseus, that exploits the sequence abundances associated with pyrosequencing data. We use data sets where samples of known diversity have been amplified and sequenced to quantify the effect of each of the sources of error on OTU inflation and to validate these algorithms.
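The OTU inflation described above can be illustrated with a toy sketch. This is not AmpliconNoise or Perseus: the greedy clustering, the 3% cutoff, the 20-base sequences and the function names (`mismatch_fraction`, `count_otus`, `substitute`) are all assumptions chosen for illustration. It only shows that a few reads carrying single-base errors can found spurious OTUs when reads are clustered without prior noise removal.

```python
def mismatch_fraction(a: str, b: str) -> float:
    """Fraction of positions that differ between two equal-length reads."""
    if len(a) != len(b):
        raise ValueError("reads must have equal length in this toy example")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def count_otus(reads: list[str], cutoff: float) -> int:
    """Greedy clustering: a read joins the first centroid within `cutoff`,
    otherwise it founds a new OTU. Returns the number of OTUs."""
    centroids: list[str] = []
    for read in reads:
        if not any(mismatch_fraction(read, c) <= cutoff for c in centroids):
            centroids.append(read)
    return len(centroids)


def substitute(seq: str, pos: int, base: str) -> str:
    """Return `seq` with the base at index `pos` replaced by `base`."""
    return seq[:pos] + base + seq[pos + 1:]


# Two hypothetical "true" sequences (made up, not real 16S rRNA data).
true_a = "ACGTACGTACGTACGTACGT"
true_b = "TTGCATGCCATGGTACCAGT"

# Error-free reads: five copies of each true sequence.
clean = [true_a] * 5 + [true_b] * 5

# The same reads plus three reads that each carry one substitution.
noisy = clean + [
    substitute(true_a, 3, "A"),
    substitute(true_a, 10, "C"),
    substitute(true_b, 7, "A"),
]

print(count_otus(clean, cutoff=0.03))  # 2 OTUs, the true diversity
print(count_otus(noisy, cutoff=0.03))  # 5 OTUs, inflated by the noisy reads
```

In this toy setting one mismatch in 20 bases is a 5% difference, so each erroneous read exceeds the 3% cutoff and is counted as its own OTU. Real amplicons are much longer, and real noise also includes homopolymer-length errors and chimeras, which this sketch does not model.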

Results

AmpliconNoise outperforms alternative algorithms, substantially reducing per-base error rates for both the GS FLX and the latest Titanium protocol. All three sources of error lead to inflation of diversity estimates. In particular, chimera formation has a hitherto unrealised importance, which varies according to amplification protocol. We show that AmpliconNoise allows accurate estimates of OTU number. Just as importantly, AmpliconNoise generates the right OTUs even at low sequence differences. We demonstrate that Perseus has very high sensitivity and is able to find 99% of chimeras, which is critical when these are present at high frequencies.

Citation impact

  • Total citations: 1,597
  • FWCI: 70.50
  • Percentile: 100%
  • References: 30
Authors

  • Christopher Quince (corresponding author), University of Glasgow
  • Anders Lanzen, Centre Jean Perrin, University of Glasgow
  • Russell J Davenport, Centre Jean Perrin, University of Glasgow
  • Peter J Turnbaugh, Centre Jean Perrin, University of Glasgow

Topics & keywords

Keywords
  • Amplicon
  • Pyrosequencing
  • Biology
  • Computational biology
  • Ion semiconductor sequencing
  • DNA sequencing
  • Genomics
  • Genetics
UN Sustainable Development Goals
  • Life on Land

Funding