Ultrafast one‐pass FASTQ data preprocessing, quality control, and deduplication using fastp
Shenzhen Bioeasy Biotechnology (China) · Shenzhen Institutes of Advanced Technology
Abstract
Large volumes of sequencing data are generated and processed every day as sequencing technology continues to evolve and sequencing applications expand. One consequence of this data explosion is the increasing cost and complexity of data processing. The preprocessing of FASTQ data, which consists of removing adapter contamination, filtering low-quality reads, and correcting wrongly represented bases, is an indispensable but resource-intensive part of sequencing data analysis. Therefore, although many software applications have been developed to solve this problem, bioinformatics scientists and engineers are still pursuing faster, simpler, and more energy-efficient software. Several…
Citation impact
- FWCI: 272.79
- Percentile: 100%
- References: 9
Authors
1
Topics & keywords
- Data deduplication
- Computer science
- Preprocessor
- Data mining
- Software
- Adapter (computing)
- Data pre-processing
- Database