Article · Scientific Reports · Jan 23, 2026 · Gold OA

The impact of K selection in K‑fold cross-validation on bias and variance in supervised learning models

The University of Sydney · Chengdu Institute of Information Technology (China)

Indexed in: Crossref, DOAJ, PubMed

Abstract

K-fold cross-validation is a widely used technique for estimating the generalisation performance of supervised machine learning models. However, the effect of the number of folds (k) on bias-variance behaviour across models and datasets is not fully understood. This study examines how varying k, from 3 to 20, relates to estimates of bias and variance across four classification algorithms, evaluated on twelve datasets of varying sizes. The four algorithms are Support Vector Machine (SVM), Decision Tree (DT), Logistic Regression (LR), and k-Nearest Neighbours (KNN). We operationalise bias as the difference between the mean cross-validated training accuracy and the held-out test accuracy, and variance as…
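
The bias measure described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the synthetic two-class data, the pure-Python KNN stand-in, and the helper names (`make_data`, `kfold_indices`, `bias_gap`) are all hypothetical, and the sketch assumes the held-out test accuracy comes from a model refit on the full development set. The variance measure is omitted because its definition is truncated in the abstract above.

```python
import random
from statistics import mean


def make_data(n, seed=0):
    """Hypothetical synthetic data: two overlapping Gaussian blobs in 2D."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        centre = 0.0 if label == 0 else 1.5
        data.append(((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)), label))
    rng.shuffle(data)
    return data


def knn_predict(train, x, k_neighbours=5):
    """Majority vote among the k_neighbours closest training points."""
    nearest = sorted(
        train,
        key=lambda p: (p[0][0] - x[0]) ** 2 + (p[0][1] - x[1]) ** 2,
    )[:k_neighbours]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def accuracy(train, evaluation, k_neighbours=5):
    """Fraction of `evaluation` points classified correctly using `train`."""
    correct = sum(knn_predict(train, x, k_neighbours) == y for x, y in evaluation)
    return correct / len(evaluation)


def kfold_indices(n, k):
    """Split range(n) into k contiguous folds whose sizes differ by at most 1."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def bias_gap(dev, test, k):
    """Mean cross-validated training accuracy minus held-out test accuracy.

    Assumption: the test accuracy is obtained from a model fitted on the
    whole development set `dev`; the abstract does not specify this.
    """
    train_accs = []
    for fold in kfold_indices(len(dev), k):
        held_out = set(fold)
        train = [p for i, p in enumerate(dev) if i not in held_out]
        train_accs.append(accuracy(train, train))  # accuracy on the training folds
    return mean(train_accs) - accuracy(dev, test)


if __name__ == "__main__":
    points = make_data(180, seed=42)
    dev, test = points[:120], points[120:]
    for k in range(3, 21):  # k from 3 to 20, as in the abstract
        print(f"k={k:2d}  bias gap={bias_gap(dev, test, k):+.4f}")
```

A positive gap means the model scores higher on the folds it was trained on than on unseen data. In practice a library implementation such as scikit-learn's cross-validation utilities would replace the hand-written fold splitter and classifier used here.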

Citation impact

Total citations: 5
FWCI: 116.53
Percentile: 100%
References: 36

Authors

3

Topics & keywords

Keywords
  • Variance (accounting)
  • Support vector machine
  • Preprocessor
  • Random forest
  • Replication (statistics)
  • Logistic regression
  • Selection (genetic algorithm)
  • Feature selection