Unsupervised Data Augmentation for Consistency Training
Indexed in arXiv, DataCite
Abstract
Semi-supervised learning lately has shown much promise in improving deep learning models when labeled data is scarce. Common among recent approaches is the use of consistency training on a large amount of unlabeled data to constrain model predictions to be invariant to input noise. In this work, we present a new perspective on how to effectively noise unlabeled examples and argue that the quality of noising, specifically those produced by advanced data augmentation methods, plays a crucial role in semi-supervised learning. By substituting simple noising operations with advanced data augmentation methods such as RandAugment and back-translation, our method brings substantial improvements across six language and…
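The consistency-training idea in the abstract can be illustrated with a minimal sketch. It assumes the consistency term is a KL divergence between the model's prediction on an unlabeled example and its prediction on an augmented copy; the abstract does not name the divergence, so that choice is an assumption. The function names (`softmax`, `kl_divergence`, `consistency_loss`, `total_loss`) and the `weight` parameter are hypothetical and are not taken from the paper's code.

```python
import math


def softmax(logits):
    """Convert raw scores to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(
        pi * (math.log(pi + eps) - math.log(qi + eps))
        for pi, qi in zip(p, q)
    )


def consistency_loss(logits_clean, logits_augmented):
    """Penalize disagreement between predictions on an unlabeled example
    and on its augmented (noised) version. ASSUMPTION: KL divergence is
    used as the disagreement measure."""
    p = softmax(logits_clean)      # prediction on the original input
    q = softmax(logits_augmented)  # prediction on the augmented input
    return kl_divergence(p, q)


def total_loss(supervised_loss, logits_clean, logits_augmented, weight=1.0):
    """Combine a supervised loss on labeled data with the weighted
    consistency term on unlabeled data. `weight` is a hypothetical knob."""
    return supervised_loss + weight * consistency_loss(
        logits_clean, logits_augmented
    )


# Identical predictions incur no penalty; differing predictions do.
same = consistency_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
different = consistency_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

In this sketch the quality of the augmentation (for example RandAugment or back-translation, as named in the abstract) only affects how `logits_augmented` is produced; the loss itself stays the same.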
Citation impact
- Total citations: 1,621
- FWCI: n/a
- Percentile: n/a
- References: 74
Authors: 5
Topics & keywords
Topics
Keywords
- Computer science
- Labeled data
- Consistency (knowledge bases)
- Artificial intelligence
- Machine learning
- Benchmark (surveying)
- Code (set theory)
- Noisy data
UN Sustainable Development Goals
- Quality Education