Troubleshooting common errors in assemblies of long-read metagenomes

Trigodet, Florian; Sachdeva, Rohan; Banfield, Jillian F.; Eren, A. Murat

doi:10.1038/s41587-025-02971-8

Article · Nature Biotechnology · Jan 2, 2026 · Hybrid OA

Alfred-Wegener-Institut Helmholtz-Zentrum für Polar- und Meeresforschung · Helmholtz Institute for Functional Marine Biodiversity · +8 more institutions

Indexed in: Crossref, PubMed

Abstract

Assessing the accuracy of long-read assemblies, especially from complex environmental metagenomes that include underrepresented organisms, is challenging. Here we benchmark four state-of-the-art long-read assembly software programs, HiCanu, hifiasm-meta, metaFlye and metaMDBG, on 21 PacBio HiFi metagenomes spanning mock communities, gut microbiomes and ocean samples. By quantifying read clipping events, in which long reads are systematically split during mapping to maximize the agreement with assembled contigs, we identify where assemblies diverge from their source reads. Our analyses reveal that long-read metagenome assemblies can include >40 errors per 100 million base pairs of assembled contigs, including…
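The clipping signal described in the abstract can be illustrated with a minimal sketch. It assumes reads have already been mapped back to the contigs and that each alignment's CIGAR string (as defined in the SAM format) is available. The function names and the `min_clip` threshold are hypothetical choices for illustration, not the authors' implementation, and a raw clipping count is only a proxy for candidate error sites rather than a confirmed error tally.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM format).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def clipped_ends(cigar, min_clip=100):
    """Return (left_clipped, right_clipped) for a single alignment.

    A read end counts as clipped when the soft (S) or hard (H) clipped
    bases at that end total at least `min_clip`. The threshold is a
    hypothetical value chosen for this sketch, not one from the paper.
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]

    def clip_len(side_ops):
        # Sum consecutive clip operations starting from one end of the read.
        total = 0
        for n, op in side_ops:
            if op not in "SH":
                break
            total += n
        return total

    return (clip_len(ops) >= min_clip, clip_len(reversed(ops)) >= min_clip)


def count_clipping_events(cigars, min_clip=100):
    """Count clipped read ends across a collection of CIGAR strings."""
    return sum(sum(clipped_ends(c, min_clip)) for c in cigars)


def events_per_100mbp(n_events, assembly_bp):
    """Normalize an event count to events per 100 million assembled bases."""
    return n_events * 1e8 / assembly_bp


if __name__ == "__main__":
    cigars = ["500S1000M", "1000M", "300S1000M300S"]
    n = count_clipping_events(cigars)
    print(n, events_per_100mbp(n, assembly_bp=250_000_000))
```

In practice, clipped ends would need to be grouped by contig position so that many reads clipping at the same site count as one candidate location, and clipping at contig ends would need separate handling; both steps are omitted here.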

Citation impact

Total citations: 5
FWCI: 47.03
Percentile: 100%
References: 63

Authors: 4

Topics & keywords

Keywords

Troubleshooting
Metagenomics
Workflow
Software
Benchmark (surveying)
Sequence assembly
Path (computing)
Genome

UN Sustainable Development Goals

Life below water

Funding

Max-Planck-Gesellschaft