Performance of neural network basecalling tools for Oxford Nanopore sequencing

Wick, Ryan R.; Judd, Louise M.; Holt, Kathryn E.

doi:10.1186/s13059-019-1727-y

Article · Genome Biology · Jun 24, 2019 · Gold OA


Monash University · London School of Hygiene & Tropical Medicine

Indexed in: Crossref, DOAJ, PubMed

Abstract

Background

Basecalling, the computational process of translating raw electrical signal to nucleotide sequence, is of critical importance to the sequencing platforms produced by Oxford Nanopore Technologies (ONT). Here, we examine the performance of different basecalling tools, looking at accuracy at the level of bases within individual reads and at majority-rule consensus basecalls in an assembly. We also investigate some additional aspects of basecalling: training using a taxon-specific dataset, using a larger neural network model and improving consensus basecalls in an assembly by additional signal-level analysis with Nanopolish.
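The two accuracy levels named above can be illustrated with a minimal Python sketch. It is not the authors' evaluation pipeline. It assumes the reads have already been aligned and gap-padded to equal length (with "-" for gaps), and it uses one common definition of identity: matching columns divided by alignment length. The function names `read_identity` and `majority_consensus` are hypothetical.

```python
from collections import Counter


def read_identity(read_aln: str, ref_aln: str) -> float:
    """Fraction of alignment columns where the read matches the reference.

    Both strings are assumed to be gap-padded ("-") to the same length.
    """
    if len(read_aln) != len(ref_aln):
        raise ValueError("aligned strings must have equal length")
    if not ref_aln:
        return 0.0
    # A column counts as a match only if both sides carry the same base.
    matches = sum(
        1 for ref, read in zip(ref_aln, read_aln) if ref == read and ref != "-"
    )
    return matches / len(ref_aln)


def majority_consensus(aligned_reads: list[str]) -> str:
    """Majority-rule consensus over equal-length, gap-padded reads.

    Columns where the most frequent symbol is a gap are dropped.
    """
    consensus = []
    for column in zip(*aligned_reads):
        symbol, _count = Counter(column).most_common(1)[0]
        if symbol != "-":
            consensus.append(symbol)
    return "".join(consensus)


reads = ["ACGTAC", "ACGAAC", "ACGTAC"]
reference = "ACGTAC"
per_read = [read_identity(r, reference) for r in reads]
consensus = majority_consensus(reads)
```

In this toy input the second read carries one substitution, so its identity is lower than that of the other two, while the consensus still recovers the reference because the error is outvoted. Errors that recur at the same position in most reads would survive the vote, which is why read accuracy and consensus accuracy are measured separately.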

Results

Training basecallers on taxon-specific data results in a significant boost in consensus accuracy, mostly due to the reduction of errors in methylation motifs. A larger neural network is able to improve both read and consensus accuracy, but at a cost to speed. Improving consensus sequences ('polishing') with Nanopolish somewhat negates the accuracy differences in basecallers, but pre-polish accuracy does have an effect on post-polish accuracy.

Citation impact

Total citations: 3,258

FWCI: 122.61
Percentile: 100%
References: 32


Authors

3

Topics & keywords


Keywords

Biology
Nanopore sequencing
Genome Biology
Human genetics
Computational biology
Artificial neural network
DNA sequencing
Nanopore
