A Comparison of Undersampling, Oversampling, and SMOTE Methods for Dealing with Imbalanced Classification in Educational Data Mining
Indexed in: Crossref, DOAJ
Abstract
Educational data mining can produce useful data-driven applications (e.g., early warning systems in schools or predictions of students’ academic achievement) based on predictive models. However, class imbalance in educational datasets can hamper the accuracy of these models, since many of them assume that the classes of the predicted variable are balanced. Although previous studies have proposed several methods for dealing with class imbalance, most focused on the technical details of improving each technique, while only a few addressed application, especially to data with different imbalance ratios. In this study, we…
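As a rough illustration of the terms used in the abstract, the sketch below computes an imbalance ratio and applies plain random undersampling and random oversampling to a toy label set. It is not the authors' procedure: the function names, the `seed` parameter, and the pass/fail data are hypothetical, and SMOTE (which creates synthetic minority samples by interpolating between neighboring minority points rather than duplicating rows) is not covered.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def _group_by_class(rows, labels):
    """Collect rows into a dict keyed by class label."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    return by_class


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows until every class is as small as the smallest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = min(len(group) for group in by_class.values())
    return [(row, label)
            for label, group in by_class.items()
            for row in rng.sample(group, target)]


def random_oversample(rows, labels, seed=0):
    """Duplicate rows (with replacement) until every class is as large as the largest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = max(len(group) for group in by_class.values())
    resampled = []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        resampled.extend((row, label) for row in group + extra)
    return resampled


# Hypothetical toy data: 8 students who passed and 2 who failed.
rows = list(range(10))
labels = ["pass"] * 8 + ["fail"] * 2

ratio = imbalance_ratio(labels)
under = random_undersample(rows, labels)
over = random_oversample(rows, labels)
```

With this toy data, undersampling shrinks the majority class to the size of the minority class, while oversampling grows the minority class to the size of the majority class; both leave the classes balanced but at different total sample sizes.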
Citation impact
- Total citations: 331
- FWCI: 54.81
- Percentile: 100%
- References: 47
Authors: 3
Topics & keywords
Topics
Keywords
- Undersampling
- Oversampling
- Resampling
- Random forest
- Computer science
- Machine learning
- Class (philosophy)
- Data mining