Preprint, arXiv (Cornell University), Apr 5, 2022, Green OA

PaLM: Scaling Language Modeling with Pathways

Indexed in: arXiv, DataCite

Abstract

Large language models have been shown to achieve remarkable performance across a variety of natural language tasks using few-shot learning, which drastically reduces the number of task-specific training examples needed to adapt the model to a particular application. To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated, Transformer language model, which we call Pathways Language Model (PaLM). We trained PaLM on 6144 TPU v4 chips using Pathways, a new ML system which enables highly efficient training across multiple TPU Pods. We demonstrate continued benefits of scaling by achieving state-of-the-art few-shot learning results on hundreds of…

Citation impact

Total citations: 2,129
References: 0

Authors

67

Topics & keywords

Keywords
  • Computer science
  • Language model
  • Artificial intelligence
  • Benchmark (surveying)
  • Scaling
  • Task (project management)
  • Machine learning
  • Variety (cybernetics)
UN Sustainable Development Goals
  • Quality Education