Troubleshooting common errors in assemblies of long-read metagenomes
Alfred-Wegener-Institut Helmholtz-Zentrum für Polar- und Meeresforschung · Helmholtz Institute for Functional Marine Biodiversity · +8 more institutions
Abstract
Assessing the accuracy of long-read assemblies, especially from complex environmental metagenomes that include underrepresented organisms, is challenging. Here we benchmark four state-of-the-art long-read assembly software programs, HiCanu, hifiasm-meta, metaFlye and metaMDBG, on 21 PacBio HiFi metagenomes spanning mock communities, gut microbiomes and ocean samples. By quantifying read clipping events, in which long reads are systematically split during mapping to maximize the agreement with assembled contigs, we identify where assemblies diverge from their source reads. Our analyses reveal that long-read metagenome assemblies can include >40 errors per 100 million base pairs of assembled contigs, including…
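The read-clipping idea in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes alignments are already available as (contig, 0-based start, CIGAR) tuples, and it counts how many reads are soft- or hard-clipped at each contig position. The function names (`clipped_ends`, `ref_span`, `clipping_positions`) and the 100-base `min_clip` threshold are hypothetical choices for illustration only.

```python
import re
from collections import Counter

# CIGAR operations as defined in the SAM specification.
# S = soft clip, H = hard clip; M, D, N, =, X consume reference bases.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]


def clipped_ends(cigar, min_clip=100):
    """Return (left_clipped, right_clipped) for one alignment.

    min_clip is an assumed threshold, not a value taken from the paper.
    """
    ops = parse_cigar(cigar)

    def clip_len(side):
        # Sum consecutive clip operations at one end of the alignment.
        total = 0
        for n, op in side:
            if op not in "SH":
                break
            total += n
        return total

    return (clip_len(ops) >= min_clip, clip_len(reversed(ops)) >= min_clip)


def ref_span(cigar):
    """Number of reference (contig) bases covered by the alignment."""
    return sum(n for n, op in parse_cigar(cigar) if op in REF_CONSUMING)


def clipping_positions(alignments, min_clip=100):
    """Count clipped read ends per (contig, position).

    alignments: iterable of (contig, ref_start, cigar), ref_start 0-based.
    """
    counts = Counter()
    for contig, ref_start, cigar in alignments:
        left, right = clipped_ends(cigar, min_clip)
        if left:
            counts[(contig, ref_start)] += 1
        if right:
            counts[(contig, ref_start + ref_span(cigar))] += 1
    return counts


# Toy input: two reads clipped at the same contig position (1000)
# and one read clipped at position 4000.
example = [
    ("c1", 0, "5000M"),
    ("c1", 1000, "200S3000M"),
    ("c1", 500, "3500M150S"),
    ("c1", 1000, "300H2000M50S"),
]
result = clipping_positions(example)
```

Positions where many reads are clipped at the same coordinate are candidates for places where the contig disagrees with its source reads. In practice the tuples would come from a BAM file produced by a long-read mapper rather than from a hand-written list.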
Citation impact
- FWCI: 47.03
- Percentile: 100%
- References: 63
Authors
- Florian Trigodet (corresponding author)
Alfred-Wegener-Institut Helmholtz-Zentrum für Polar- und Meeresforschung, Helmholtz Institute for Functional Marine Biodiversity
- Rohan Sachdeva
Innovative Genomics Institute, University of California, Berkeley
- Jillian F. Banfield
Planetary Science Institute, Lawrence Berkeley National Laboratory, Discovery Institute, Innovative Genomics Institute, University of California, Berkeley
- A. Murat Eren
Alfred-Wegener-Institut Helmholtz-Zentrum für Polar- und Meeresforschung, Carl von Ossietzky Universität Oldenburg, Marine Biological Laboratory, Helmholtz Institute for Functional Marine Biodiversity, Max Planck Institute for Marine Microbiology
Topics & keywords
- Troubleshooting
- Metagenomics
- Workflow
- Software
- Benchmark (surveying)
- Sequence assembly
- Path (computing)
- Genome
- Life below water