SNP-sites: rapid efficient extraction of SNPs from multi-FASTA alignments
Wellcome Sanger Institute · University of Brighton · +2 more institutions
Abstract
Rapidly decreasing genome sequencing costs have led to a proportionate increase in the number of samples used in prokaryotic population studies. Extracting single nucleotide polymorphisms (SNPs) from a large whole genome alignment is now a routine task, but existing tools have failed to scale efficiently with the increased size of studies. These tools are slow, memory-inefficient, and installed through non-standard procedures. We present SNP-sites, which can rapidly extract SNPs from a multi-FASTA alignment using modest resources and can output results in multiple formats for downstream analysis. SNPs can be extracted from an 8.3 GB alignment file (1842 taxa, 22 618 sites) in 267 seconds using 59 MB of RAM…
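To make the task concrete, the following is a minimal Python sketch of what "extracting SNPs from a multi-FASTA alignment" means: find the alignment columns where more than one base is observed. It is not the SNP-sites implementation and makes no attempt at its speed or memory profile, since it loads every sequence into memory. The function names `parse_fasta` and `snp_columns`, the example data, and the rule that only A, C, G and T count as bases (gaps and `N` are ignored) are assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical toy alignment: four taxa, six aligned columns.
EXAMPLE = """>s1
ACGTAC
>s2
ACGTTC
>s3
AC-TAC
>s4
ATGTAN
"""


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse multi-FASTA text into {sequence name: uppercase sequence}."""
    records: Dict[str, str] = {}
    name = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records[name] = "".join(chunks)
            name = line[1:].split()[0]
            chunks = []
        else:
            chunks.append(line.upper())
    if name is not None:
        records[name] = "".join(chunks)
    return records


def snp_columns(records: Dict[str, str]) -> List[int]:
    """Return 1-based positions of columns with more than one base.

    Assumption: only A, C, G and T are counted; gaps and ambiguity
    codes such as N do not make a column variable on their own.
    """
    seqs = list(records.values())
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences in an alignment must have equal length")
    positions = []
    for i in range(length):
        bases = {s[i] for s in seqs} & set("ACGT")
        if len(bases) > 1:
            positions.append(i + 1)
    return positions


def snp_alignment(records: Dict[str, str]) -> Dict[str, str]:
    """Reduce every sequence to its variable columns only."""
    cols = snp_columns(records)
    return {name: "".join(seq[c - 1] for c in cols) for name, seq in records.items()}
```

Calling `snp_columns(parse_fasta(EXAMPLE))` on the toy data reports columns 2 and 5; column 3 (a gap in `s3`) and column 6 (an `N` in `s4`) are not reported under the stated assumption. A tool meant for multi-gigabyte alignments would need to stream the file rather than hold it in memory as this sketch does.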
Citation impact
- FWCI: 27.50
- Percentile: 100%
- References: 21
Authors: 7
Topics & keywords
- SNP
- Single-nucleotide polymorphism
- Computer science
- Computational biology
- SNP genotyping
- Genome
- Population
- Biology