Preprint · arXiv (Cornell University) · Oct 20, 2022 · Green OA

Scaling Instruction-Finetuned Language Models

Indexed in: arXiv, DataCite

Abstract

Finetuning language models on a collection of datasets phrased as instructions has been shown to improve model performance and generalization to unseen tasks. In this paper we explore instruction finetuning with a particular focus on (1) scaling the number of tasks, (2) scaling the model size, and (3) finetuning on chain-of-thought data. We find that instruction finetuning with the above aspects dramatically improves performance on a variety of model classes (PaLM, T5, U-PaLM), prompting setups (zero-shot, few-shot, CoT), and evaluation benchmarks (MMLU, BBH, TyDiQA, MGSM, open-ended generation). For instance, Flan-PaLM 540B instruction-finetuned on 1.8K tasks outperforms PaLM 540B by a large margin (+9.4% on…

Citation impact

1,187 total citations
References
0
Citations per year

Authors

35

Topics & keywords

Keywords
  • Computer science
  • Margin (machine learning)
  • Usability
  • Variety (cybernetics)
  • Scaling
  • Language model
  • Artificial intelligence
  • Parallel computing
UN Sustainable Development Goals
  • Quality Education