REALM: Retrieval-Augmented Language Model Pre-Training
Indexed in: arXiv, DataCite
Abstract
Language model pre-training has been shown to capture a surprising amount of world knowledge, crucial for NLP tasks such as question answering. However, this knowledge is stored implicitly in the parameters of a neural network, requiring ever-larger networks to cover more facts. To capture knowledge in a more modular and interpretable way, we augment language model pre-training with a latent knowledge retriever, which allows the model to retrieve and attend over documents from a large corpus such as Wikipedia, used during pre-training, fine-tuning and inference. For the first time, we show how to pre-train such a knowledge retriever in an unsupervised manner, using masked language modeling as the learning…
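The retrieve-then-attend idea in the abstract can be illustrated with a toy sketch. The abstract as shown here does not spell out how documents are scored or how retrieved evidence is combined, so everything below is an assumption made for illustration: inner-product scoring between a query vector and document vectors, a softmax over the top-k scores, and a weighted sum of per-document answer probabilities. The function names (`retrieve_top_k`, `marginal_answer_prob`) and all numbers are hypothetical and are not taken from the paper's code.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def retrieve_top_k(query_vec, doc_vecs, k):
    """Return (indices, probabilities) for the k highest-scoring documents.

    Assumption: relevance is the inner product of query and document vectors,
    and the retrieval distribution is a softmax restricted to the top k.
    """
    scores = [dot(query_vec, d) for d in doc_vecs]
    top = sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)[:k]
    probs = softmax([scores[i] for i in top])
    return top, probs


def marginal_answer_prob(retrieval_probs, answer_probs_given_doc):
    """Combine per-document answer probabilities, weighted by retrieval probability."""
    return sum(p_z * p_y for p_z, p_y in zip(retrieval_probs, answer_probs_given_doc))


# Toy data: three "documents" embedded in 2-D, one query vector.
doc_vecs = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]]
query_vec = [1.0, 0.0]

top, probs = retrieve_top_k(query_vec, doc_vecs, k=2)
# Hypothetical reader outputs: probability of the correct answer given each retrieved document.
answer_probs = [0.9, 0.1]
p_answer = marginal_answer_prob(probs, answer_probs)
print(top, probs, p_answer)
```

In this sketch the retriever and the reader are stand-ins: a real system would produce the vectors and the per-document answer probabilities with trained neural encoders, and would search a corpus far too large to score exhaustively.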
Citation impact
- Total citations: 515
- FWCI: not available
- Percentile: not available
- References: 38
Authors: 5
Topics & keywords
Keywords
- Realm
- Training (meteorology)
- Computer science
- Language model
- Natural language processing
- Artificial intelligence
- History
- Geography
UN Sustainable Development Goals
- Quality Education