A Survey on Data Augmentation for Text Classification
Technische Universität Darmstadt
Abstract
Data augmentation, the artificial creation of training data for machine learning by transformations, is a widely studied research field across machine learning disciplines. While it is useful for increasing a model's generalization capabilities, it can also address many other challenges and problems, from overcoming a limited amount of training data to regularizing the objective, to limiting the amount of data used to protect privacy. Based on a precise description of the goals and applications of data augmentation and a taxonomy for existing works, this survey is concerned with data augmentation methods for textual classification and aims at providing a concise and comprehensive overview for researchers and…
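As a rough illustration of what "artificial creation of training data by transformations" can look like for text classification, the sketch below adds label-preserving noisy copies of each training example. The specific transformations (random token swap and random token deletion), the function names, and the parameters `n_copies`, `p`, and `seed` are illustrative assumptions and are not taken from the abstract.

```python
import random


def random_swap(tokens, rng):
    """Return a copy of tokens with two randomly chosen positions swapped."""
    out = list(tokens)
    if len(out) >= 2:
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_delete(tokens, rng, p=0.1):
    """Drop each token with probability p, always keeping at least one token."""
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def augment(dataset, n_copies=2, seed=0):
    """Extend a list of (text, label) pairs with noisy copies.

    Assumption: small token-level noise does not change the class label.
    """
    rng = random.Random(seed)
    augmented = list(dataset)  # keep the original examples first
    for text, label in dataset:
        tokens = text.split()
        if not tokens:
            continue  # nothing to transform in an empty text
        for _ in range(n_copies):
            noisy = random_delete(random_swap(tokens, rng), rng)
            augmented.append((" ".join(noisy), label))
    return augmented


if __name__ == "__main__":
    data = [("the movie was great", "pos"), ("terrible plot and acting", "neg")]
    for text, label in augment(data):
        print(label, "|", text)
```

Whether such noise actually preserves the label depends on the task, so the noise level would need to be checked against held-out data before relying on the augmented set.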
Citation impact
- FWCI: 51.33
- Percentile: 100%
- References: 203
Authors
- 3
Topics & keywords
- Computer science
- Taxonomy (biology)
- Generalization
- Limiting
- Field (mathematics)
- Data science
- Training set
- Artificial intelligence
- Quality Education