A comprehensive survey on imbalanced data learning
The University of Queensland · Yunnan University · +3 more institutions
Abstract
With the expansion of data availability, machine learning (ML) has achieved remarkable breakthroughs in both academia and industry. However, imbalanced data distributions are prevalent in many types of raw data and severely hinder ML performance by biasing decision-making. To deepen the understanding of imbalanced data and facilitate related research and applications, this survey systematically analyzes various real-world data formats and organizes existing research across these formats into four distinct categories: data re-balancing, feature representation, training strategy, and ensemble learning. This structured analysis helps researchers comprehensively…
Citation impact
- FWCI: 121.75
- Percentile: 100%
- References: 112
Authors: 8
Topics & keywords
- Raw data
- Feature (linguistics)
- Data exploration
- Path (computing)
- Data type
- Ensemble learning