Review · ACM Computing Surveys · Aug 13, 2016 · Green OA

A Survey of Predictive Modeling on Imbalanced Domains

Universidade do Porto · INESC TEC

Indexed in Crossref

Abstract

Many real-world data-mining applications involve obtaining predictive models using datasets with strongly imbalanced distributions of the target variable. Frequently, the least-common values of this target variable are associated with events that are highly relevant for end users (e.g., fraud detection, unusual returns on stock markets, anticipation of catastrophes). Moreover, the events may have different costs and benefits, which, when associated with the rarity of some of them in the available training data, creates serious problems for predictive modeling techniques. This article presents a survey of existing techniques for handling these important applications of predictive analytics. Although most…

Citation impact

Total citations: 1,077
FWCI: 75.53
Percentile: 100%
References: 269

Authors

3

Topics & keywords

Keywords
  • Computer science
  • Predictive analytics
  • Machine learning
  • Variable (mathematics)
  • Anticipation (artificial intelligence)
  • Artificial intelligence
  • Data mining
  • Data science
UN Sustainable Development Goals
  • Peace, Justice and Strong Institutions

Funding