Informed and automated k-mer size selection for genome assembly
Indexed in: Crossref, DOAJ, PubMed
Abstract
MOTIVATION: Genome assembly tools based on the de Bruijn graph framework rely on a parameter k, which represents a trade-off between several competing effects that are difficult to quantify. There is currently a lack of tools that would automatically estimate the best k to use and/or quickly generate histograms of k-mer abundances that would allow the user to make an informed decision.
RESULTS: We develop a fast and accurate sampling method that constructs approximate abundance histograms with several orders of magnitude performance improvement over traditional methods. We then present a fast heuristic that uses the generated abundance histograms for putative k values to estimate the best possible value of k.…
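To make the two ideas in the abstract concrete, here is a minimal Python sketch. It is not the authors' implementation: the hash-based sampling scheme, the function names (`canonical`, `sampled_histogram`, `pick_k`), the `sample_rate` and `min_abundance` parameters, and above all the selection criterion in `pick_k` are assumptions chosen for illustration. The sketch counts only those k-mers whose hash falls into one of `sample_rate` buckets, scales the number of distinct k-mers back up, and then picks the k with the most k-mers seen at least `min_abundance` times.

```python
import hashlib
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def sampled_histogram(reads, k, sample_rate=1):
    """Approximate abundance histogram {abundance: estimated number of distinct k-mers}.

    Assumption: a k-mer is kept when its hash is divisible by sample_rate, so
    every occurrence of a kept k-mer is counted and its abundance stays exact.
    Only the number of distinct k-mers per abundance is scaled by sample_rate.
    """
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) - set("ACGT"):
                continue  # skip k-mers containing ambiguous bases such as N
            kmer = canonical(kmer)
            digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
            if int.from_bytes(digest, "big") % sample_rate == 0:
                counts[kmer] += 1
    histogram = Counter(counts.values())
    return {abundance: n * sample_rate for abundance, n in sorted(histogram.items())}


def pick_k(reads, candidate_ks, sample_rate=1, min_abundance=2):
    """Stand-in criterion (hypothetical, not the paper's heuristic): choose the k
    with the largest estimated number of distinct k-mers seen at least
    min_abundance times. Ties go to the first candidate in candidate_ks."""
    def score(k):
        histogram = sampled_histogram(reads, k, sample_rate)
        return sum(n for abundance, n in histogram.items() if abundance >= min_abundance)
    return max(candidate_ks, key=score)
```

With `sample_rate=1` the histogram is exact, which is convenient for checking the sketch on tiny inputs; larger values trade accuracy for less counting work. For example, `pick_k(reads, range(21, 82, 10), sample_rate=100)` would compare a handful of putative k values on a list of read strings.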
Citation impact
- Total citations: 783
- FWCI: 28.19
- Percentile: 100%
- References: 63
Authors: 2
Topics & keywords
Topics
Keywords
- Computer science
- Sequence assembly
- Histogram
- Heuristic
- De Bruijn graph
- k-mer
- De Bruijn sequence
- Graph