Sequence-specific error profile of Illumina sequencers
Nara Institute of Science and Technology
Abstract
We identified the sequence-specific starting positions of consecutive miscalls in the mapping of reads obtained from the Illumina Genome Analyser (GA). Detailed analysis of the miscall pattern indicated that the underlying mechanism involves sequence-specific interference with the base elongation process during sequencing. The two major sequence patterns that trigger this sequence-specific error (SSE) are (i) inverted repeats and (ii) GGC sequences. We speculate that these sequences favor dephasing by inhibiting single-base elongation, through (i) folding of single-stranded DNA and (ii) altered enzyme preference. This phenomenon is a major cause of sequence coverage variability and of the unfavorable bias observed…
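To make the two trigger patterns concrete, the following is a minimal sketch of how one might flag candidate sites in a reference sequence. It is an illustration only, not the authors' detection method: the function names, the arm length, and the maximum loop size are hypothetical choices, and exact matching of the arms is a simplifying assumption.

```python
# Illustrative scan for the two sequence patterns named in the abstract.
# All names and thresholds here are assumptions, not taken from the paper.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T/N only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_motif(seq, motif="GGC"):
    """Return 0-based start positions of every exact occurrence of motif."""
    seq = seq.upper()
    width = len(motif)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == motif]


def find_inverted_repeats(seq, arm=5, max_loop=6):
    """Return (left_start, right_start, left_arm) for each arm of length
    `arm` whose exact reverse complement begins 0..max_loop bases after it."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * arm + 1):
        left = seq[i:i + arm]
        target = reverse_complement(left)
        for loop in range(max_loop + 1):
            j = i + arm + loop          # start of the candidate right arm
            if j + arm > len(seq):
                break                   # right arm would run off the end
            if seq[j:j + arm] == target:
                hits.append((i, j, left))
    return hits


# Toy sequence: arm GGCAC, a 4-base loop, then its reverse complement GTGCC.
example = "TTGGCACAAAAGTGCCTT"
ggc_sites = find_motif(example)                 # [2]
hairpin_sites = find_inverted_repeats(example)  # [(2, 11, 'GGCAC')]
```

A real analysis would work from mapped reads and per-position mismatch rates rather than from the reference alone; this sketch only shows what the two patterns look like as string searches.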
Citation impact
- FWCI: 26.87
- Percentile: 100%
- References: 40
Authors
- 13
Topics & keywords
- Biology
- Genetics
- Computational biology
- Sequence (biology)
- DNA sequencing
- Sequence assembly
- DNA
- Gene