Sensitivity Analysis of k-Fold Cross Validation in Prediction Error Estimation

University of the Basque Country

Indexed in: Crossref, PubMed

Abstract

In the machine learning field, the performance of a classifier is usually measured in terms of prediction error. In most real-world problems, the error cannot be calculated exactly and must be estimated. Therefore, it is important to choose an appropriate estimator of the error. This paper analyzes the statistical properties, bias and variance, of the k-fold cross-validation classification error estimator (k-cv). Our main contribution is a novel theoretical decomposition of the variance of the k-cv considering its sources of variance: sensitivity to changes in the training set and sensitivity to changes in the folds. The paper also compares the bias and variance of the estimator for different…
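The two sources of variance named in the abstract can be observed empirically with a small simulation. The following is a minimal sketch, not the paper's theoretical decomposition: the synthetic one-dimensional data, the nearest-class-mean classifier, and all function names (`make_data`, `kfold_error`, `fold_sensitivity`, `training_set_sensitivity`) are assumptions introduced only for illustration.

```python
import random
import statistics

def make_data(n, rng):
    """Hypothetical two-class data: class 0 ~ N(0, 1), class 1 ~ N(1.5, 1)."""
    data = []
    for _ in range(n):
        y = rng.randrange(2)
        data.append((rng.gauss(1.5 * y, 1.0), y))
    return data

def train(samples):
    """Nearest-class-mean classifier: the model is the pair of class means."""
    x0 = [x for x, y in samples if y == 0]
    x1 = [x for x, y in samples if y == 1]
    # Fall back to 0.0 if a class is absent from the training folds.
    return (statistics.fmean(x0) if x0 else 0.0,
            statistics.fmean(x1) if x1 else 0.0)

def predict(model, x):
    mean0, mean1 = model
    return 0 if abs(x - mean0) <= abs(x - mean1) else 1

def kfold_error(data, k, rng):
    """k-fold cross-validation estimate of the classification error."""
    idx = list(range(len(data)))
    rng.shuffle(idx)                      # random partition into folds
    folds = [idx[i::k] for i in range(k)]
    errors = 0
    for fold in folds:
        held_out = set(fold)
        model = train([data[i] for i in idx if i not in held_out])
        errors += sum(predict(model, data[i][0]) != data[i][1] for i in fold)
    return errors / len(data)

def fold_sensitivity(data, k, repeats, rng):
    """Mean and variance of the estimate over partitions of ONE fixed dataset."""
    est = [kfold_error(data, k, rng) for _ in range(repeats)]
    return statistics.fmean(est), statistics.pvariance(est)

def training_set_sensitivity(n, k, repeats, rng):
    """Mean and variance of the estimate over freshly drawn datasets."""
    est = [kfold_error(make_data(n, rng), k, rng) for _ in range(repeats)]
    return statistics.fmean(est), statistics.pvariance(est)

if __name__ == "__main__":
    rng = random.Random(0)
    data = make_data(100, rng)
    print("fixed data, varying folds:", fold_sensitivity(data, 10, 30, rng))
    print("varying data and folds:  ", training_set_sensitivity(100, 10, 30, rng))
```

In this sketch, `fold_sensitivity` holds the dataset fixed and varies only the partition, while `training_set_sensitivity` redraws the dataset each time, so its variance mixes both effects. When k equals the sample size (leave-one-out), only one partition exists, so the fold-induced variance is zero.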

Citation impact

Total citations: 1,865
FWCI: 15.85
Percentile: 100%
References: 42

Authors

3 authors

Topics & keywords

Keywords
  • Estimator
  • Bayes error rate
  • Cross-validation
  • Kappa
  • Naive Bayes classifier
  • Statistics
  • Classifier (UML)
  • Pattern recognition (psychology)