Article · Frontiers of Computer Science · Mar 7, 2026 · Hybrid OA

A comprehensive survey on imbalanced data learning

The University of Queensland · Yunnan University · +3 more institutions

Indexed in Crossref

Abstract

With the expansion of data availability, machine learning (ML) has achieved remarkable breakthroughs in both academia and industry. However, imbalanced data distributions are prevalent in many types of raw data and severely hinder the performance of ML by biasing its decision-making processes. To deepen the understanding of imbalanced data and facilitate related research and applications, this survey systematically analyzes various real-world data formats and groups existing research across these formats into four distinct categories: data re-balancing, feature representation, training strategy, and ensemble learning. This structured analysis helps researchers comprehensively…

Citation impact

  • Total citations: 6
  • FWCI: 121.75
  • Percentile: 100%
  • References: 112

Authors

8 authors

Topics & keywords

Keywords
  • Raw data
  • Feature (linguistics)
  • Data exploration
  • Path (computing)
  • Data type
  • Ensemble learning