Review | ACM Computing Surveys | Jun 17, 2022 | Green OA

A Survey on Data Augmentation for Text Classification

Technische Universität Darmstadt

Indexed in: arXiv, Crossref, DataCite

Abstract

Data augmentation, the artificial creation of training data for machine learning by transformations, is a widely studied research field across machine learning disciplines. While it is useful for increasing a model's generalization capabilities, it can also address many other challenges and problems, from overcoming a limited amount of training data to regularizing the objective, to limiting the amount of data used to protect privacy. Based on a precise description of the goals and applications of data augmentation and a taxonomy for existing works, this survey is concerned with data augmentation methods for textual classification and aims at providing a concise and comprehensive overview for researchers and…

Citation impact

Total citations: 396
FWCI: 51.33
Percentile: 100%
References: 203

Authors

3

Topics & keywords

Keywords
  • Computer science
  • Taxonomy (biology)
  • Generalization
  • Limiting
  • Field (mathematics)
  • Data science
  • Training set
  • Artificial intelligence
UN Sustainable Development Goals
  • Quality Education