HiFiAdapterFilt, a memory efficient read processing pipeline, prevents occurrence of adapter sequence in PacBio HiFi reads and their negative impacts on genome assembly
Daniel K. Inouye U.S. Pacific Basin Agricultural Research Center · Oak Ridge Associated Universities
Abstract
Pacific Biosciences HiFi read technology is currently the industry standard for high-accuracy long-read sequencing and has been widely adopted by large sequencing and assembly initiatives to generate de novo assemblies of non-model organisms. Although adapter contamination filtering is routine in traditional short-read analysis pipelines, it has not been widely adopted for HiFi workflows.
Analysis of 55 publicly available HiFi datasets revealed that a read-sanitation step is necessary to remove sequence artifacts derived from PacBio library preparation from read pools, because adapter sequences can otherwise be erroneously integrated into assemblies.
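The abstract does not say how adapter-containing reads are detected, so the following is only a minimal sketch of what a read-sanitation step could look like, not the HiFiAdapterFilt implementation. It assumes exact substring matching on both strands. The adapter string `TOY_ADAPTER` and the function names are hypothetical and chosen for illustration; a production pipeline would more plausibly use an alignment-based search that tolerates mismatches, with the real library-preparation adapter sequences.

```python
# Illustrative sketch only: exact-match adapter screening of reads.
# TOY_ADAPTER is a made-up sequence, NOT a real PacBio adapter (assumption).
TOY_ADAPTER = "GATTACAGG"

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def contains_adapter(read: str, adapter: str = TOY_ADAPTER) -> bool:
    """True if the adapter occurs in the read on either strand."""
    read = read.upper()
    adapter = adapter.upper()
    return adapter in read or reverse_complement(adapter) in read


def filter_reads(reads: dict[str, str], adapter: str = TOY_ADAPTER) -> tuple[dict[str, str], list[str]]:
    """Split reads into those kept and the names of those removed."""
    kept: dict[str, str] = {}
    removed: list[str] = []
    for name, seq in reads.items():
        if contains_adapter(seq, adapter):
            removed.append(name)  # whole read is discarded, not trimmed
        else:
            kept[name] = seq
    return kept, removed


if __name__ == "__main__":
    example_reads = {
        "read1": "AAAAGATTACAGGTTTT",   # adapter on the forward strand
        "read2": "AAAACCTGTAATCTTTT",   # adapter on the reverse strand
        "read3": "ACGTACGTACGTACGT",    # no adapter
    }
    kept, removed = filter_reads(example_reads)
    print("kept:", sorted(kept))
    print("removed:", removed)
```

Discarding the whole read rather than trimming it is a design choice made here for simplicity; the list of removed read names can be written out so the filtering step stays auditable.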
Citation impact
- FWCI: 21.48
- Percentile: 100%
- References: 24
Authors
- Sheina B. Sim (corresponding author)
  Daniel K. Inouye U.S. Pacific Basin Agricultural Research Center
- Renée L. Corpuz
  Daniel K. Inouye U.S. Pacific Basin Agricultural Research Center
- Tyler J. Simmonds
  Oak Ridge Associated Universities, Daniel K. Inouye U.S. Pacific Basin Agricultural Research Center
- Scott M. Geib
  Daniel K. Inouye U.S. Pacific Basin Agricultural Research Center
Topics & keywords
- Adapter (computing)
- Computer science
- Pipeline (software)
- Sequence assembly
- Computational biology
- Biology
- Workflow
- Computer hardware
Funding
- U.S. Department of Energy. Awards: DE-SC0014664, 0500-00093-001-00-D, SC0014664
- U.S. Department of Agriculture. Awards: DE-SC0014664, 0500-00093-001-00-D
- Oak Ridge Associated Universities. Awards: DE-SC0014664, 0500-00093-001-00-D
- Oak Ridge Institute for Science and Education. Awards: DE-SC0014664, SC0014664, 0500-00093-001-00-D
- Agricultural Research Service. Awards: DE-SC0014664, 2040-22430-027-00-D, 0500-00093-001-00-D