MaxBin: an automated binning method to recover individual genomes from metagenomes using an expectation-maximization algorithm
Joint BioEnergy Institute · Lawrence Berkeley National Laboratory · +3 more institutions
Abstract
Recovering individual genomes from metagenomic datasets allows access to uncultivated microbial populations that may have important roles in natural and engineered ecosystems. Understanding the roles of these uncultivated populations has broad application in ecology, evolution, biotechnology and medicine. Accurate binning of assembled metagenomic sequences is an essential step in recovering the genomes and understanding microbial functions.
We have developed a binning algorithm, MaxBin, which uses an expectation-maximization algorithm to automate the binning of scaffolds assembled from metagenomic sequencing reads. Binning of simulated metagenomic datasets demonstrated that MaxBin had high levels of accuracy in binning microbial genomes. MaxBin was used to recover genomes from metagenomic data obtained through the Human Microbiome Project, which demonstrated its ability to recover genomes from real metagenomic datasets with variable sequencing coverages. Application of MaxBin to metagenomes obtained from microbial consortia adapted to grow on cellulose allowed genomic analysis of new, uncultivated, cellulolytic bacterial populations, including an abundant myxobacterial population distantly related to Sorangium cellulosum that possessed a much smaller genome (5 MB versus 13 to 14 MB) but had a more extensive set of genes for biomass deconstruction. For the cellulolytic consortia, the MaxBin results were compared to binning using emergent self-organizing maps (ESOMs) and differential coverage binning, demonstrating that MaxBin performed comparably to these methods but had distinct advantages in automation, resolution of related genomes and sensitivity.
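To make the expectation-maximization idea concrete, the following is a minimal toy sketch, not MaxBin's implementation. It assumes each scaffold is summarized by a single integer coverage value and that each genome bin emits coverages from a Poisson distribution; the function names `poisson_log_pmf` and `em_bin`, the starting means and the example data are all hypothetical. The abstract does not describe MaxBin's actual probability model or feature set, so this only illustrates the alternation between soft assignment (E-step) and parameter re-estimation (M-step).

```python
import math


def poisson_log_pmf(x, lam):
    """Log-probability of observing count x under a Poisson with mean lam."""
    return x * math.log(lam) - lam - math.lgamma(x + 1)


def em_bin(coverages, init_means, n_iter=50):
    """Toy EM for a Poisson mixture over scaffold coverages (illustrative only).

    coverages  : list of non-negative integer coverage values, one per scaffold
    init_means : starting guess for the mean coverage of each bin (assumption)
    Returns (labels, means, weights).
    """
    k = len(init_means)
    means = [float(m) for m in init_means]
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: probability that each scaffold belongs to each bin.
        resp = []
        for x in coverages:
            logs = [math.log(weights[j]) + poisson_log_pmf(x, means[j])
                    for j in range(k)]
            top = max(logs)  # subtract the max for numerical stability
            probs = [math.exp(v - top) for v in logs]
            total = sum(probs)
            resp.append([p / total for p in probs])
        # M-step: re-estimate each bin's weight and mean coverage.
        for j in range(k):
            mass = max(sum(r[j] for r in resp), 1e-12)  # guard against empty bins
            weights[j] = mass / len(coverages)
            weighted_sum = sum(r[j] * x for r, x in zip(resp, coverages))
            means[j] = max(weighted_sum / mass, 1e-9)
    # Hard assignment: each scaffold goes to its most probable bin.
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return labels, means, weights


# Hypothetical example: five low-coverage and five high-coverage scaffolds.
example_coverages = [9, 11, 10, 12, 8, 48, 52, 50, 47, 53]
example_labels, example_means, example_weights = em_bin(example_coverages, [5.0, 60.0])
```

In this sketch the number of bins and the starting means are supplied by hand, and the result depends on those starting values, as with any EM procedure.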
Citation impact
- FWCI: 17.64
- Percentile: 100%
- References: 61
Authors
- Yu-Wei Wu (corresponding author)
  Joint BioEnergy Institute, Lawrence Berkeley National Laboratory
- Yung-Hsu Tang
  City College of San Francisco, Joint BioEnergy Institute
- Susannah G. Tringe
  Joint Genome Institute, Lawrence Berkeley National Laboratory
- Blake A. Simmons
  Sandia National Laboratories California, Joint BioEnergy Institute
- Steven W. Singer
  Lawrence Berkeley National Laboratory, Joint BioEnergy Institute
Topics & keywords
- Metagenomics
- Genome
- Biology
- Human Microbiome Project
- Computational biology
- Microbiome
- Genomics
- Sequence assembly
- Life in Land
Funding
- U.S. Department of Energy. Award: Contract No. DE-AC02-05CH11231
- Joint Genome Institute. Award: Contract No. DE-AC02-05CH11231
- Office of Science. Award: Contract No. DE-AC02-05CH11231
- Biological and Environmental Research. Award: Contract No. DE-AC02-05CH11231
- Lawrence Berkeley National Laboratory. Award: Contract No. DE-AC02-05CH11231