The Curious Case of Neural Text Degeneration
Abstract
Despite considerable advances in neural language modeling, it remains an open question what the best decoding strategy is for text generation from a language model (e.g. to generate a story). The counter-intuitive empirical observation is that even though the use of likelihood as training objective leads to high quality models for a broad range of language understanding tasks, maximization-based decoding methods such as beam search lead to degeneration — output text that is bland, incoherent, or gets stuck in repetitive loops. To address this we propose Nucleus Sampling, a simple but effective method to draw considerably higher quality text out of neural language models. Our approach avoids text degeneration…
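The abstract as reproduced here is cut off before it describes the method itself. As a rough illustration only, the sketch below follows the top-p formulation under which Nucleus Sampling is commonly described: keep the smallest set of most probable tokens whose cumulative probability reaches a threshold, renormalize, and sample from that set. The function names `nucleus_filter` and `nucleus_sample`, the default `top_p=0.9`, and the plain-list representation of the next-token distribution are assumptions made for this example, not details taken from this page.

```python
import random


def nucleus_filter(probs, top_p=0.9):
    """Return {token_id: renormalized_prob} for the smallest top-p set.

    `probs` is a list of next-token probabilities (hypothetical input;
    a real model would supply these from its softmax output).
    """
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0, 1]")
    # Rank token ids from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break  # cumulative mass has reached the threshold
    # Renormalize so the kept probabilities sum to 1.
    return {i: probs[i] / mass for i in kept}


def nucleus_sample(probs, top_p=0.9, rng=random):
    """Draw one token id from the renormalized top-p set."""
    nucleus = nucleus_filter(probs, top_p)
    ids = list(nucleus)
    weights = [nucleus[i] for i in ids]
    return rng.choices(ids, weights=weights, k=1)[0]


if __name__ == "__main__":
    toy_distribution = [0.5, 0.3, 0.15, 0.05]  # made-up values for illustration
    print(nucleus_filter(toy_distribution, top_p=0.9))
    print(nucleus_sample(toy_distribution, top_p=0.9))
```

In this toy distribution the least probable token falls outside the kept set at `top_p=0.9`, so it can never be drawn. In contrast to a fixed-size cutoff, the number of tokens kept varies with the shape of the distribution.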
Citation impact
528 total citations
- FWCI: 68.49
- Percentile: 100%
- References: 28
Authors: 5
Topics & keywords
- Decoding methods
- Computer science
- Language model
- Maximization
- Artificial intelligence
- Sampling (signal processing)
- Importance sampling
- Algorithm
UN Sustainable Development Goals
- Quality Education