MUSCLE: multiple sequence alignment with high accuracy and high throughput
Indexed in: Crossref, DOAJ, PubMed
Abstract
We describe MUSCLE, a new computer program for creating multiple alignments of protein sequences. Elements of the algorithm include fast distance estimation using kmer counting, progressive alignment using a new profile function we call the log-expectation score, and refinement using tree-dependent restricted partitioning. The speed and accuracy of MUSCLE are compared with T-Coffee, MAFFT and CLUSTALW on four test sets of reference alignments: BAliBASE, SABmark, SMART and a new benchmark, PREFAB. MUSCLE achieves the highest, or joint highest, rank in accuracy on each of these sets. Without refinement, MUSCLE achieves average accuracy statistically indistinguishable from T-Coffee and MAFFT, and is the fastest…
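The k-mer counting step named in the abstract can be illustrated with a minimal sketch. It assumes a simple similarity measure, the fraction of k-mers shared between two sequences (counting multiplicity) divided by the number of k-mers in the shorter sequence, and takes the distance as one minus that fraction. The function names `kmer_counts` and `kmer_distance` and the default `k=3` are illustrative assumptions, not taken from the MUSCLE source code, and the sketch omits refinements such as compressed amino-acid alphabets.

```python
from collections import Counter


def kmer_counts(seq, k=3):
    """Count every overlapping k-mer (substring of length k) in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_distance(x, y, k=3):
    """Alignment-free distance in [0, 1] based on shared k-mers.

    Assumption for illustration: distance = 1 - F, where F is the number
    of k-mers common to both sequences (with multiplicity) divided by the
    number of k-mers in the shorter sequence.
    """
    n_kmers = min(len(x), len(y)) - k + 1
    if n_kmers <= 0:
        raise ValueError("both sequences must have length >= k")
    # Counter intersection keeps the minimum count of each shared k-mer.
    shared = sum((kmer_counts(x, k) & kmer_counts(y, k)).values())
    return 1.0 - shared / n_kmers


if __name__ == "__main__":
    # Hypothetical toy sequences, not real proteins.
    seqs = {"a": "ACDEFGHIK", "b": "ACDEFGHLK", "c": "WWYYWWYYW"}
    for p in seqs:
        for q in seqs:
            if p < q:
                print(p, q, round(kmer_distance(seqs[p], seqs[q]), 3))
```

Because no alignment is computed, a distance of this kind costs time roughly linear in sequence length for each pair, which is why it suits a fast first pass before progressive alignment.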
Citation impact
- Total citations: 46,334
- FWCI: 87.26
- Percentile: 100%
- References: 46
Authors: 1
Topics & keywords
Keywords
- Benchmark (surveying)
- Multiple sequence alignment
- Biology
- Computer science
- Rank (graph theory)
- Sequence alignment
- Source code
- Tree (set theory)