Article · BMC Bioinformatics · Apr 17, 2006 · Gold OA

Length-dependent prediction of protein intrinsic disorder

Temple University · Indiana University Bloomington · +2 more institutions

Indexed in: Crossref, DOAJ, PubMed

Abstract

Background

Due to the functional importance of intrinsically disordered proteins or protein regions, prediction of intrinsic protein disorder from amino acid sequence has become an area of active research, as witnessed in the 6th experiment on Critical Assessment of Techniques for Protein Structure Prediction (CASP6). Since the initial work by Romero et al. (Identifying disordered regions in proteins from amino acid sequences, IEEE Int. Conf. Neural Netw., 1997), our group has developed several predictors optimized for long disordered regions (>30 residues) with prediction accuracy exceeding 85%. However, these predictors are less successful on short disordered regions (≤30 residues). [...]

Results

[...] short (≤30 residues) and long (>30 residues) disordered regions, respectively. A meta predictor was then trained to integrate the specialized predictors into the final predictor model. As the 10-fold cross-validation results showed, the VSL2 predictors achieved well-balanced prediction accuracies of 81% on both short and long disordered regions. Comparisons over the VSL2 training dataset via 10-fold cross-validation and a blind-test set of unrelated recent PDB chains indicated that VSL2 predictors were significantly more accurate than several existing predictors of intrinsic protein disorder.
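The abstract does not spell out how the meta predictor integrates the specialized predictors. As a rough illustration only, the sketch below assumes a per-residue convex combination in which a meta weight in [0, 1] expresses how strongly a residue is believed to lie in a long disordered region. The function names `combine_scores` and `call_disorder`, the weighting rule, and the 0.5 decision threshold are hypothetical choices for this example, not details taken from the paper.

```python
from typing import List, Sequence


def combine_scores(
    short_scores: Sequence[float],
    long_scores: Sequence[float],
    meta_weights: Sequence[float],
) -> List[float]:
    """Blend per-residue outputs of two specialized predictors.

    Assumption (not from the source): the final score is
    w * long_score + (1 - w) * short_score, where w is the meta
    predictor's per-residue weight for the long-region predictor.
    """
    if not (len(short_scores) == len(long_scores) == len(meta_weights)):
        raise ValueError("all inputs must have one value per residue")

    combined = []
    for s, l, w in zip(short_scores, long_scores, meta_weights):
        if not 0.0 <= w <= 1.0:
            raise ValueError("meta weights must lie in [0, 1]")
        combined.append(w * l + (1.0 - w) * s)
    return combined


def call_disorder(scores: Sequence[float], threshold: float = 0.5) -> List[bool]:
    """Label a residue as disordered when its score reaches the threshold.

    The 0.5 default is an illustrative assumption.
    """
    return [score >= threshold for score in scores]


if __name__ == "__main__":
    # Toy scores for a three-residue sequence (made-up numbers).
    short = [0.2, 0.9, 0.4]
    long_ = [0.8, 0.1, 0.2]
    weights = [1.0, 0.0, 0.5]
    final = combine_scores(short, long_, weights)
    print(call_disorder(final))
```

With a weight of 1.0 the long-region predictor decides alone, with 0.0 the short-region predictor decides alone, and intermediate weights interpolate between the two. This is one simple way to let a single model cover disordered regions of any length.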

Conclusion

The VSL2 predictors are applicable to disordered regions of any length and can accurately identify the short disordered regions that are often misclassified by our previous disorder predictors. The success of the VSL2 predictors further confirmed the previously observed differences in amino acid compositions and sequence properties between short and long disordered regions, and justified our approaches for modelling short and long disordered regions separately. The VSL2 predictors are freely accessible for non-commercial use at http://www.ist.temple.edu/disprot/predictorVSL2.php.

Citation impact

  • Total citations: 951
  • FWCI: 12.54
  • Percentile: 100%
  • References: 81

Authors

5

Topics & keywords

Keywords
  • Intrinsically disordered proteins
  • Protein structure prediction
  • Predictive modelling
  • Computational biology
  • Protein Data Bank (RCSB PDB)
  • Protein structure
  • Biology
  • Computer science

Funding