Ultrafast one‐pass FASTQ data preprocessing, quality control, and deduplication using fastp

Chen, Shifu

doi:10.1002/imt2.107

Article · iMeta · May 1, 2023 · Gold OA

Shifu Chen

Shenzhen Bioeasy Biotechnology (China) · Shenzhen Institutes of Advanced Technology

Indexed in: Crossref, DOAJ, PubMed

Abstract

A large amount of sequencing data is generated and processed every day with the continuous evolution of sequencing technology and the expansion of sequencing applications. One consequence of this sequencing data explosion is the increasing cost and complexity of data processing. The preprocessing of FASTQ data, which means removing adapter contamination, filtering low-quality reads, and correcting wrongly represented bases, is an indispensable but resource-intensive part of sequencing data analysis. Therefore, although many software applications have been developed to solve this problem, bioinformatics scientists and engineers are still pursuing faster, simpler, and more energy-efficient software. Several…

Citation impact

Total citations: 1,879

FWCI: 272.79
Percentile: 100%
References: 9

Authors

Shifu Chen (corresponding author)
Shenzhen Bioeasy Biotechnology (China), Shenzhen Institutes of Advanced Technology

Topics & keywords

Keywords

Data deduplication
Computer science
Preprocessor
Data mining
Software
Adapter (computing)
Data pre-processing
Database
