Protein database searches using compositionally adjusted substitution matrices
National Institutes of Health · National Center for Biotechnology Information
Abstract
Almost all protein database search methods use amino acid substitution matrices for scoring, optimizing, and assessing the statistical significance of sequence alignments. Much care and effort has therefore gone into constructing substitution matrices, and the quality of search results can depend strongly upon the choice of the proper matrix. A long-standing problem has been the comparison of sequences with biased amino acid compositions, for which standard substitution matrices are not optimal. To address this problem, we have recently developed a general procedure for transforming a standard matrix into one appropriate for the comparison of two sequences with arbitrary, and possibly differing, compositions.…
Citation impact
- FWCI: 5.69
- Percentile: 100%
- References: 43
Authors (7)
- Stephen F. Altschul (corresponding author)
National Institutes of Health, National Center for Biotechnology Information
- John C. Wootton
National Institutes of Health, National Center for Biotechnology Information
- E. Michael Gertz
National Institutes of Health, National Center for Biotechnology Information
- Richa Agarwala
National Institutes of Health, National Center for Biotechnology Information
- Aleksandr Morgulis
National Institutes of Health, National Center for Biotechnology Information
Topics & keywords
- Substitution (logic)
- Sequence (biology)
- Matrix (chemical analysis)
- Computer science
- Data mining
- Database
- Mathematics
- Biology
- Reduced inequalities