A unified catalog of 204,938 reference genomes from the human gut microbiome
European Bioinformatics Institute · Wellcome Sanger Institute · +9 more institutions
Abstract
Comprehensive, high-quality reference genomes are required for functional characterization and taxonomic assignment of the human gut microbiota. We present the Unified Human Gastrointestinal Genome (UHGG) collection, comprising 204,938 nonredundant genomes from 4,644 gut prokaryotes. These genomes encode >170 million protein sequences, which we collated in the Unified Human Gastrointestinal Protein (UHGP) catalog. The UHGP more than doubles the number of gut proteins in comparison to those present in the Integrated Gene Catalog. More than 70% of the UHGG species lack cultured representatives, and 40% of the UHGP lack functional annotations. Intraspecies genomic variation analyses revealed a large…
Citation impact
- FWCI: 54.85
- Percentile: 100%
- References: 80
Authors (13)
- Alexandre Almeida (corresponding author)
European Bioinformatics Institute, Wellcome Sanger Institute
- Stephen Nayfach
Lawrence Berkeley National Laboratory, Joint Genome Institute
- Miguel Boland
European Bioinformatics Institute
- Francesco Strozzi
Enterome (France)
- Martín Beracochea
European Bioinformatics Institute
Topics & keywords
- Biology
- Genome
- ENCODE
- Computational biology
- Gene
- Microbiome
- Human genome
- Genetics
Funding
- U.S. Department of Energy. Awards: DE-AC02-05CH11231
- European Molecular Biology Laboratory
- Joint Genome Institute. Awards: DE-AC02-05CH11231
- National Energy Research Scientific Computing Center. Awards: DE-AC02-05CH11231
- Directorate for Biological Sciences
- Office of Science. Awards: DE-AC02-05CH11231
- Biotechnology and Biological Sciences Research Council. Awards: BB/R015228/1, BB/N018354/1