Article | Expert Systems with Applications | Sep 10, 2011 | Hybrid OA

An experimental comparison of classification algorithms for imbalanced credit scoring data sets

University of Southampton

Indexed in Crossref

Abstract

In this paper, we set out to compare several techniques that can be used in the analysis of imbalanced credit scoring data sets. In a credit scoring context, imbalanced data sets frequently occur as the number of defaulting loans in a portfolio is usually much lower than the number of observations that do not default. As well as using traditional classification techniques such as logistic regression, neural networks and decision trees, this paper will also explore the suitability of gradient boosting, least squares support vector machines and random forests for loan default prediction. Five real-world credit scoring data sets are used to build classifiers and test their performance. In our experiments, we…

Citation impact

Total citations: 715
FWCI: 21.93
Percentile: 100%
References: 38

Authors

2

Topics & keywords

Keywords
  • Machine learning
  • Artificial intelligence
  • Decision tree
  • Computer science
  • Support vector machine
  • Boosting (machine learning)
  • Random forest
  • Gradient boosting
UN Sustainable Development Goals
  • Reduced inequalities

Funding