PANDAseq: paired-end assembler for Illumina sequences
Abstract
Illumina paired-end reads are used to analyse microbial communities by targeting amplicons of the 16S rRNA gene. Publicly available tools are needed to assemble overlapping paired-end reads while correcting mismatches and uncalled bases; by using quality information, many errors could be corrected to obtain higher sequence yields.
PANDAseq assembles paired-end reads rapidly and corrects most errors. Uncertain error corrections come from reads with many low-quality bases identified by upstream processing. Benchmarks were done using real error masks on simulated data, a pure source template, and a pooled template of genomic DNA from known organisms. PANDAseq assembled reads more rapidly and with less error incorporation than alternative methods.
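To make the idea of quality-aware overlap assembly concrete, here is a minimal Python sketch. It is not PANDAseq's actual algorithm or scoring model; it is a simplified illustration under stated assumptions. The function names (`reverse_complement`, `merge_pair`), the `min_overlap` and `max_mismatch_frac` parameters, and the toy sequences are all hypothetical. The sketch takes the longest overlap whose mismatch fraction is acceptable, resolves each mismatch in favour of the base with the higher Phred quality, and fills an uncalled base (`N`) from the other read.

```python
# Simplified illustration only; NOT PANDAseq's real scoring model.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T, N."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(fwd, fwd_qual, rev, rev_qual, min_overlap=4, max_mismatch_frac=0.2):
    """Merge a forward read with the reverse complement of its mate.

    fwd_qual and rev_qual are lists of Phred scores, one per base.
    Returns the merged sequence, or None if no acceptable overlap exists.
    The overlap heuristic (longest overlap under a mismatch-fraction
    threshold) is an assumption made for this sketch.
    """
    rc = reverse_complement(rev)
    rc_qual = rev_qual[::-1]  # qualities must be reversed along with the bases

    # Try overlaps from longest to shortest and keep the first acceptable one.
    for overlap in range(min(len(fwd), len(rc)), min_overlap - 1, -1):
        start = len(fwd) - overlap
        mismatches = sum(
            1
            for i in range(overlap)
            if fwd[start + i] != rc[i] and "N" not in (fwd[start + i], rc[i])
        )
        if mismatches / overlap <= max_mismatch_frac:
            break
    else:
        return None  # no overlap passed the threshold

    merged = list(fwd[:start])
    for i in range(overlap):
        f_base, f_q = fwd[start + i], fwd_qual[start + i]
        r_base, r_q = rc[i], rc_qual[i]
        if f_base == "N":
            merged.append(r_base)  # uncalled base: trust the other read
        elif r_base == "N":
            merged.append(f_base)
        elif f_q >= r_q:
            merged.append(f_base)  # agreement, or forward base is more reliable
        else:
            merged.append(r_base)
    merged.extend(rc[overlap:])
    return "".join(merged)


# Toy example: a 20-base template read from both ends with an 8-base overlap.
template = "ACGTTGCAAGGCTTACGATC"
fwd = "ACGTTGCAAGGCGT"            # template[:14] with a miscall at index 12
fwd_qual = [40] * 12 + [5, 40]    # the miscalled base has low quality
rev = reverse_complement(template[6:])
rev_qual = [40] * len(rev)

merged = merge_pair(fwd, fwd_qual, rev, rev_qual)
print(merged == template)
```

In this toy case the low-quality miscall in the forward read is overridden by the high-quality base from the mate. A real assembler would weigh the quality scores probabilistically when choosing the overlap, rather than applying a fixed mismatch threshold.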
Citation impact
- FWCI: 56.73
- Percentile: 100%
- References: 12
Authors
- 5
Topics & keywords
- Amplicon
- Computer science
- Sequence (biology)
- Error detection and correction
- k-mer
- Amplicon sequencing
- Sequence assembly
- Computational biology