Language Models are Few-Shot Learners
Indexed in: arXiv, DataCite
Abstract
Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions - something which current NLP systems still largely struggle to do. Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. Specifically, we train…
Citation impact
3,029 total citations
- FWCI: not available
- Percentile: not available
- References: 127
Authors: 31
Topics & keywords
Topics
Keywords
- Computer science
- Task (project management)
- Language model
- Natural language processing
- Sentence
- Artificial intelligence
- Word (group theory)
- Simple (philosophy)
UN Sustainable Development Goals
- Quality Education