Article · Nature Methods · Jun 24, 2019 · Hybrid OA

Protein-level assembly increases protein sequence recovery from metagenomic samples manyfold

Seoul National University · Johns Hopkins University · +1 more institution

Indexed in Crossref, PubMed

Abstract

The open-source de novo protein-level assembler, Plass (https://plass.mmseqs.com), assembles six-frame-translated sequencing reads into protein sequences. It recovers 2–10 times more protein sequences from complex metagenomes and can assemble huge datasets. We assembled two redundancy-filtered reference protein catalogs, 2 billion sequences from 640 soil samples (soil reference protein catalog) and 292 million sequences from 775 marine eukaryotic metatranscriptomes (marine eukaryotic reference catalog), the largest free collections of protein sequences. The protein-level assembler can assemble protein catalogs from raw metagenomic sequencing data, enabling large-scale metagenomics studies.
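
The abstract's "six-frame-translated sequencing reads" can be illustrated with a minimal Python sketch. It is not Plass's own code: the function names (`reverse_complement`, `translate`, `six_frame_translate`) are hypothetical, and the sketch assumes the standard genetic code and reads containing only A, C, G, and T (any other codon is rendered as `X`). It translates a nucleotide read in the three forward frames and the three frames of its reverse complement.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: no
# alternative translation tables are needed for this illustration).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons from the start of seq; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translate(read: str) -> list[str]:
    """Translate a read in frames 0-2 of both strands (forward strand first)."""
    read = read.upper()
    strands = (read, reverse_complement(read))
    return [translate(strand[offset:]) for strand in strands for offset in range(3)]


if __name__ == "__main__":
    for frame, protein in enumerate(six_frame_translate("ATGGCCTAA")):
        print(frame, protein)
```

Each read yields six candidate protein fragments; stop codons appear as `*`. How fragments are then selected and assembled is specific to the tool and is not shown here.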

Citation impact

Total citations: 480
FWCI: 14.96
Percentile: 100%
References: 28

Authors

3

Topics & keywords

Keywords
  • Metagenomics
  • Computational biology
  • Biology
  • Protein sequencing
  • Sequence assembly
  • Redundancy (engineering)
  • Genetics
  • Gene
UN Sustainable Development Goals
  • Life below water

Funding