Class-Balanced Loss Based on Effective Number of Samples
Cornell University · Google (United States)
Abstract
With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the…
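The re-weighting idea described in the abstract can be sketched in a few lines. The abstract is truncated here before it reaches the formula, so the sketch assumes the form given in the published paper, E_n = (1 - β^n) / (1 - β), where n is the number of samples in a class and β in [0, 1) is a hyperparameter. The function names, the example counts, and the normalization (weights sum to the number of classes) are illustrative assumptions, not taken from this page.

```python
def effective_number(n, beta):
    """Effective number of samples for a class with n observations.

    Assumes the form E_n = (1 - beta**n) / (1 - beta) from the published paper.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must be in [0, 1)")
    return (1.0 - beta ** n) / (1.0 - beta)


def class_balanced_weights(counts, beta):
    """Per-class weights proportional to 1 / E_n.

    Normalizing so the weights sum to the number of classes is an
    assumption made here to keep the overall loss scale comparable.
    """
    raw = [1.0 / effective_number(n, beta) for n in counts]
    scale = len(counts) / sum(raw)
    return [w * scale for w in raw]


# Hypothetical long-tailed class counts: one head class, two tail classes.
counts = [5000, 500, 50]

# beta = 0 gives E_n = 1 for every class, i.e. no re-weighting.
print(class_balanced_weights(counts, beta=0.0))

# As beta approaches 1, E_n approaches n, i.e. inverse-frequency weighting.
print(class_balanced_weights(counts, beta=0.999))
```

Because E_n saturates as n grows, adding samples to an already large class changes its weight very little, which matches the diminishing-benefit argument in the abstract. The resulting list could be passed as per-class weights to a weighted loss.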
Citation impact
- FWCI: 96.45
- Percentile: 100%
- References: 82
Authors
- 5
Topics & keywords
- Hyperparameter
- Weighting
- Class (philosophy)
- Computer science
- Point (geometry)
- Scale (ratio)
- Sampling (signal processing)
- Algorithm