A simple method to control over-alignment in the MAFFT multiple sequence alignment program
The University of Osaka · Kyoto University
Abstract
MOTIVATION: We present a new feature of the MAFFT multiple alignment program for suppressing over-alignment (aligning unrelated segments). Conventional MAFFT is highly sensitive in aligning conserved regions in remote homologs, but the risk of over-alignment has recently been increasing as low-quality or noisy sequences accumulate in protein sequence databases, owing, for example, to sequencing errors and difficulties in gene prediction. RESULTS: The proposed method uses a variable scoring matrix for different pairs of sequences (or groups) within a single multiple sequence alignment, based on the global similarity of each pair. This method significantly increases the number of correctly gapped sites in real examples…
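As a rough illustration of the idea of a pair-dependent scoring scheme, the toy sketch below picks a different scoring function for each pair of sequences according to a crude global similarity estimate. It is not MAFFT's implementation: the ungapped identity measure, the linear shift, and all names and parameter values (`global_identity`, `pair_score_fn`, `build_pair_scorers`, `max_penalty`) are assumptions made for illustration only.

```python
from typing import Callable, Dict, Tuple


def global_identity(a: str, b: str) -> float:
    """Crude similarity proxy (assumption): fraction of identical
    positions over the shorter sequence, with no gaps considered."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / n


def pair_score_fn(similarity: float,
                  match: float = 2.0,
                  mismatch: float = -1.0,
                  max_penalty: float = 2.0) -> Callable[[str, str], float]:
    """Return a residue scoring function for one pair of sequences.

    Hypothetical rule: every score is shifted down by an amount that
    grows as the pair's global similarity falls, so that aligning
    residues of dissimilar sequences becomes less attractive relative
    to leaving them gapped.
    """
    shift = max_penalty * (1.0 - similarity)

    def score(x: str, y: str) -> float:
        base = match if x == y else mismatch
        return base - shift

    return score


def build_pair_scorers(
    seqs: Dict[str, str]
) -> Dict[Tuple[str, str], Callable[[str, str], float]]:
    """Build one scoring function per unordered pair of sequences."""
    names = list(seqs)
    scorers = {}
    for i, p in enumerate(names):
        for q in names[i + 1:]:
            sim = global_identity(seqs[p], seqs[q])
            scorers[(p, q)] = pair_score_fn(sim)
    return scorers


# Toy input: "a" and "b" are identical, "c" shares only its first two residues.
seqs = {"a": "ACDEFG", "b": "ACDEFG", "c": "ACWWWW"}
scorers = build_pair_scorers(seqs)
```

In this sketch, a pairwise dynamic-programming aligner would look up `scorers[(p, q)]` for the pair being aligned instead of using one fixed matrix for the whole alignment; how the real program derives and applies its matrices is not specified in the text above.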
Citation impact
- FWCI: 25.07
- Percentile: 100%
- References: 56
Authors: 2
Topics & keywords
- Multiple sequence alignment
- Computer science
- Sequence alignment
- Data mining
- Software
- Dynamic programming
- Sequence (biology)
- Similarity (geometry)