Article | BioData Mining | Jan 7, 2015 | Gold OA

Performance of genetic programming optimised Bowtie2 on genome comparison and analytic testing (GCAT) benchmarks

University College London

Indexed in: Crossref, DOAJ, PubMed

Abstract

Background

Genetic studies are increasingly based on short, noisy sequences produced by next generation scanners. Typically, complete DNA sequences are assembled by matching short NextGen sequences against reference genomes. Despite considerable algorithmic gains since the turn of the millennium, matching both single ended and paired end strings to a reference remains computationally demanding. Further, tailoring bioinformatics tools to each new task or scanner remains highly skilled and labour intensive. With this in mind, we recently demonstrated a genetic programming based automated technique which generated a version of the state-of-the-art alignment tool Bowtie2 that was considerably faster on short sequences produced by a scanner at the Broad Institute and released as part of the 1000 Genomes Project.

Results

Bowtie2GP and the original Bowtie2 release were compared on bioplanet's GCAT synthetic benchmarks. The Bowtie2GP enhancements were also applied to the latest Bowtie2 release (2.2.3, 29 May 2014), and the resulting version retained both the GP and the manually introduced improvements.

Citation impact

Total citations: 696
FWCI: 12.50
Percentile: 100%
References: 30

Authors

1

Topics & keywords

Keywords
  • Genetic programming
  • Computer science
  • Matching (statistics)
  • Task (project management)
  • Scanner
  • Sequence (biology)
  • Genome
  • Dynamic programming
UN Sustainable Development Goals
  • Decent work and economic growth

Funding