Article | arXiv (Cornell University) | Nov 9, 2022 | Green OA

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Krahmer, Emiel; Clouth, Felix; Teven Le Scao; Vromans, Ruben; Pauws, Steffen
Indexed in: arXiv, DataCite

Abstract

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive…

Citation impact

Total citations: 284
FWCI: 38.84
Percentile: 100%
References: 0

Authors

8
  • Krahmer, Emiel (corresponding author)
  • Clouth, Felix
  • Teven Le Scao
  • Vromans, Ruben
  • Pauws, Steffen

Topics & keywords

Keywords
  • Computer science
  • License
  • Language model
  • Transformer
  • Bloom
  • Variety (cybernetics)
  • Natural language
  • Resource (disambiguation)

Funding