FastUniq: A Fast De Novo Duplicates Removal Tool for Paired Short Reads
Chinese Academy of Medical Sciences & Peking Union Medical College · Stony Brook University · +1 more institution
Abstract
The presence of duplicates introduced by PCR amplification is a major issue in paired short reads from next-generation sequencing platforms. These duplicates can seriously affect research applications, such as scaffolding in whole-genome sequencing and the discovery of large-scale genome variations, and are usually removed. We present FastUniq, a fast de novo tool for removing duplicates from paired short reads. FastUniq identifies duplicates by comparing sequences between read pairs and does not require a complete genome sequence as a prerequisite. FastUniq can simultaneously handle reads of different lengths and achieves a highly efficient running time, which increases linearly at an…
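The idea of reference-free duplicate removal by comparing read pairs can be illustrated with a minimal sketch. This is not FastUniq's implementation: it uses a hash set of exact sequence pairs rather than whatever comparison strategy the tool employs, and it does not model reads of different lengths. The function name `remove_duplicate_pairs` and the toy reads are hypothetical.

```python
from typing import Iterable, Iterator, Tuple

ReadPair = Tuple[str, str]  # (read 1 sequence, read 2 sequence)


def remove_duplicate_pairs(pairs: Iterable[ReadPair]) -> Iterator[ReadPair]:
    """Yield each read pair the first time its sequences are seen.

    Assumption: two pairs are duplicates only if both mates match
    exactly (case-insensitively). No reference genome is consulted.
    """
    seen = set()
    for read1, read2 in pairs:
        key = (read1.upper(), read2.upper())
        if key in seen:
            continue  # same sequences on both mates: treat as a duplicate
        seen.add(key)
        yield (read1, read2)


# Toy input: the second pair repeats the first; the third differs in mate 2 only.
pairs = [
    ("ACGT", "TTGA"),
    ("ACGT", "TTGA"),
    ("ACGT", "TTGC"),
    ("GGCA", "TTGA"),
]
unique = list(remove_duplicate_pairs(pairs))
```

Note that a pair is kept whenever either mate differs, so reads that share only one mate's sequence are not collapsed. A set-based approach trades memory for simplicity and is used here only for illustration.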
Citation impact
- FWCI: 2.69
- Percentile: 100%
- References: 29
Authors (8)
- Haibin Xu
  Chinese Academy of Medical Sciences & Peking Union Medical College
- Xiang Luo
  Chinese Academy of Medical Sciences & Peking Union Medical College
- Jun Qian
  Chinese Academy of Medical Sciences & Peking Union Medical College
- Xiaohui Pang
  Chinese Academy of Medical Sciences & Peking Union Medical College
- Jingyuan Song
  Chinese Academy of Medical Sciences & Peking Union Medical College
Topics & keywords
- Genome
- Computational biology
- Computer science
- k-mer
- DNA sequencing
- Sequence assembly
- Whole genome sequencing
- Biology