LLaMA: Open and Efficient Foundation Language Models

Touvron, Hugo; Lavril, Thibaut; Izacard, Gautier; Martinet, Xavier; Lachaux, Marie-Anne; Lacroix, Timothée; Rozière, Baptiste; Goyal, Naman; Hambro, Eric; Azhar, Faisal; Rodriguez, Aurelien; Joulin, Armand; Grave, Édouard; Lample, Guillaume

doi:10.48550/arxiv.2302.13971

Preprint, arXiv (Cornell University), Feb 27, 2023, Green OA

Indexed in: arXiv, DataCite

Abstract

We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. We release all our models to the research community.

Citation impact

Total citations: 3,888

FWCI: —
Percentile: —
References: 0

Authors: 14

Topics & keywords

Keywords

Foundation (evidence)
Computer science
Ranging
Language model
State (computer science)
Artificial intelligence
Programming language
Archaeology
