High-coverage whole-genome sequencing of the expanded 1000 Genomes Project cohort including 602 trios
New York Genome Center · Broad Institute · +7 more institutions
Abstract
The 1000 Genomes Project (1kGP) is the largest fully open resource of whole-genome sequencing (WGS) data consented for public distribution without access or use restrictions. The final, phase 3 release of the 1kGP included 2,504 unrelated samples from 26 populations and was based primarily on low-coverage WGS. Here, we present a high-coverage 3,202-sample WGS 1kGP resource, which now includes 602 complete trios, sequenced to a depth of 30X using Illumina. We performed single-nucleotide variant (SNV) and short insertion and deletion (INDEL) discovery and generated a comprehensive set of structural variants (SVs) by integrating multiple analytic methods through a machine learning model. We show gains in…
Citation impact
- FWCI: 80.21
- Percentile: 100%
- References: 83
Authors
- 42
Topics & keywords
- Indel
- Biology
- 1000 Genomes Project
- Imputation (statistics)
- Genome
- Whole genome sequencing
- Computational biology
- Genetics
- Partnerships for the goals
Funding
- Wellcome Trust (Award: WT104947/Z/14/Z)
- European Molecular Biology Laboratory
- National Institutes of Health (Award: U24HG007497)
- National Institute of Mental Health (Award: MH115957)
- National Human Genome Research Institute (Awards: UM1HG008901, UM1HG008853)
- U.S. National Library of Medicine
- Eunice Kennedy Shriver National Institute of Child Health and Human Development (Awards: HD081256, UM1HG008895)