Article · Information · Jan 16, 2023 · Gold OA

A Comparison of Undersampling, Oversampling, and SMOTE Methods for Dealing with Imbalanced Classification in Educational Data Mining

University of Alberta

Indexed in: Crossref, DOAJ

Abstract

Educational data mining can produce useful data-driven applications (e.g., early warning systems in schools or the prediction of students’ academic achievement) based on predictive models. However, class imbalance in educational datasets can hamper the accuracy of these models, because many of them are designed on the assumption that the predicted classes are balanced. Although previous studies have proposed several methods to deal with the class imbalance problem, most focused on the technical details of improving each technique, while only a few focused on application, especially to data with different imbalance ratios. In this study, we…

Citation impact

Total citations: 331
FWCI: 54.81
Percentile: 100%
References: 47
Authors: 3

Topics & keywords

Keywords
  • Undersampling
  • Oversampling
  • Resampling
  • Random forest
  • Computer science
  • Machine learning
  • Class (philosophy)
  • Data mining