Integrating sequencing datasets to form highly confident SNP and indel genotype calls for a whole human genome
Indexed in DataCite
Abstract
Clinical adoption of human genome sequencing requires methods with known accuracy of genotype calls at millions or billions of positions across a genome. Previous work showing discordance amongst sequencing methods and algorithms has made clear the need for a highly accurate set of genotypes across a whole genome that could be used as a benchmark. We present methods to make highly confident SNP, indel, and homozygous reference genotype calls for NA12878, the pilot genome for the Genome in a Bottle Consortium. We minimize bias towards any method by integrating and arbitrating between 14 datasets from 5 sequencing technologies, 7 mappers, and 3 variant callers. Regions for which no confident genotype call could…
Citation impact
- Total citations: 552
- FWCI: —
- Percentile: —
- References: 0
Authors
7
Topics & keywords
Keywords
- Indel
- Benchmarking
- Genome
- Genotype
- Human genome
- Computational biology
- Benchmark (surveying)
- Genomics
UN Sustainable Development Goals
- Partnerships for the goals