An experimental comparison of classification algorithms for imbalanced credit scoring data sets
Indexed in Crossref
Abstract
In this paper, we set out to compare several techniques that can be used in the analysis of imbalanced credit scoring data sets. In a credit scoring context, imbalanced data sets frequently occur as the number of defaulting loans in a portfolio is usually much lower than the number of observations that do not default. As well as using traditional classification techniques such as logistic regression, neural networks and decision trees, this paper will also explore the suitability of gradient boosting, least squares support vector machines and random forests for loan default prediction. Five real-world credit scoring data sets are used to build classifiers and test their performance. In our experiments, we…
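The imbalance described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the portfolio size, the number of defaults, and the helper names (`make_portfolio`, `accuracy`, `recall_of_defaults`) are assumptions chosen for illustration. It shows why a trivial rule that labels every loan as non-defaulting can look accurate while identifying no defaulters at all.

```python
# Illustrative only: a hypothetical portfolio, not one of the paper's data sets.
# Label convention assumed here: 1 = default, 0 = non-default.


def make_portfolio(n_loans: int, n_defaults: int) -> list[int]:
    """Return labels for a portfolio in which n_defaults loans default."""
    return [1] * n_defaults + [0] * (n_loans - n_defaults)


def accuracy(y_true: list[int], y_pred: list[int]) -> float:
    """Fraction of loans whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall_of_defaults(y_true: list[int], y_pred: list[int]) -> float:
    """Fraction of true defaulters that the predictions flag as defaults."""
    flagged = [p for t, p in zip(y_true, y_pred) if t == 1]
    return sum(flagged) / len(flagged) if flagged else 0.0


# Assumed imbalance: 50 defaults among 1000 loans.
labels = make_portfolio(n_loans=1000, n_defaults=50)

# Trivial baseline: predict "non-default" for every loan.
always_good = [0] * len(labels)

baseline_accuracy = accuracy(labels, always_good)
baseline_recall = recall_of_defaults(labels, always_good)

print(f"accuracy: {baseline_accuracy:.2f}, recall of defaults: {baseline_recall:.2f}")
```

Under these assumed numbers the baseline is right on every non-defaulting loan and wrong on every defaulting one, so accuracy alone says little about how well a classifier handles the minority class.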
Citation impact
715 total citations
- FWCI: 21.93
- Percentile: 100%
- References: 38
Authors: 2
Topics & keywords
Topics
Keywords
- Machine learning
- Artificial intelligence
- Decision tree
- Computer science
- Support vector machine
- Boosting (machine learning)
- Random forest
- Gradient boosting
UN Sustainable Development Goals
- Reduced inequalities