Article · ACM SIGKDD Explorations Newsletter · Jun 1, 2004 · Closed access

A study of the behavior of several methods for balancing machine learning training data

Brazilian Society of Computational and Applied Mathematics

Indexed in Crossref

Abstract

Several aspects can influence the performance achieved by existing learning systems. One reported aspect is class imbalance, in which the training examples belonging to one class heavily outnumber those belonging to the other class. In this situation, which arises in real-world data describing an infrequent but important event, the learning system may have difficulty learning the concept related to the minority class. In this work we perform a broad experimental evaluation of ten methods, three of them proposed by the authors, for dealing with the class imbalance problem on thirteen UCI data sets. Our experiments provide evidence that class…
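To make the notion of balancing training data concrete, the following is a minimal sketch of random over-sampling, a generic balancing technique in which minority-class examples are duplicated at random until every class is as large as the largest one. The abstract as shown here does not list the ten evaluated methods, so this is an illustration only and not a reproduction of the authors' procedure. The function name `random_oversample` and the toy data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(examples, labels, seed=0):
    """Duplicate randomly chosen examples of each smaller class until
    every class has as many examples as the largest class.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class

    out_x, out_y = list(examples), list(labels)
    for cls, n in counts.items():
        # All original examples of this class form the sampling pool.
        pool = [x for x, y in zip(examples, labels) if y == cls]
        # Draw with replacement to make up the shortfall (none for the largest class).
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_x.extend(extra)
        out_y.extend([cls] * len(extra))
    return out_x, out_y


# Toy data: 8 majority ("neg") examples and 2 minority ("pos") examples.
X = [[i] for i in range(10)]
y = ["neg"] * 8 + ["pos"] * 2

X_bal, y_bal = random_oversample(X, y)
balanced_counts = Counter(y_bal)
```

After the call, both classes in the toy data have eight examples. Because the added examples are exact copies, this kind of balancing adds no new information, and a fair evaluation would apply it to the training split only.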

Citation impact

Total citations: 4,101
FWCI: 37.46
Percentile: 100%
References: 26

Authors

3

Topics & keywords

Keywords
  • Computer science
  • Class (philosophy)
  • Machine learning
  • Artificial intelligence
  • Sampling (signal processing)
  • Simple random sample
  • Event (particle physics)
  • Data mining