LoRDEC: accurate and efficient long read error correction
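Centre National de la Recherche Scientifique · University of Helsinki · Université de Montpellier · Helsinki Institute for Information Technology · Laboratoire d'Informatique, de Robotique et de Microélectronique de Montpellier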
Abstract
MOTIVATION: PacBio single-molecule real-time sequencing is a third-generation sequencing technique that produces long reads, with comparatively lower throughput and a higher error rate. The errors include numerous indels and complicate downstream analyses such as mapping or de novo assembly. A hybrid strategy that takes advantage of the high accuracy of second-generation short reads has been proposed for correcting long reads. Mapping short reads onto long reads provides sufficient coverage to eliminate up to 99% of errors, but at the expense of prohibitive running times and considerable amounts of disk and memory space.
RESULTS: We present LoRDEC, a hybrid error correction method that builds a succinct de Bruijn…
Citation impact
- FWCI: 12.52
- Percentile: 100%
- References: 33
Authors
- Leena Salmela (corresponding author)
Centre National de la Recherche Scientifique, University of Helsinki, Université de Montpellier, Helsinki Institute for Information Technology, Laboratoire d'Informatique, de Robotique et de Microélectronique de Montpellier
- Éric Rivals (corresponding author)
Centre National de la Recherche Scientifique, University of Helsinki, Université de Montpellier, Helsinki Institute for Information Technology, Laboratoire d'Informatique, de Robotique et de Microélectronique de Montpellier
Topics & keywords
- Computer science
- Error detection and correction
- De Bruijn sequence
- De Bruijn graph
- Graph
- Sequence assembly
- Traverse
- Algorithm