A Survey of Large Language Models for Arabic Language and its Dialects

King Saud University

Indexed in: arXiv, Crossref, DataCite

Abstract

This survey presents a comprehensive review of Large Language Models (LLMs) developed for the Arabic language and its dialects. It categorizes models by architecture (encoder-only, decoder-only, and encoder-decoder) and by linguistic form: Classical Arabic, Modern Standard Arabic (MSA), and Dialectal Arabic. We analyze monolingual, bilingual, and multilingual models, evaluating their performance on tasks such as sentiment analysis, named entity recognition, and question answering. The survey also assesses model openness, considering access to source code, training data, weights, and documentation. Our findings highlight a concentration of resources on MSA, a lack of diverse dialectal datasets,…

Citation impact

  • Total citations: 5
  • FWCI: 48.05
  • Percentile: 99%
  • References: 49

Authors

3

Topics & keywords

Keywords
  • Arabic
  • Linguistics
  • Computer science
  • Natural language processing
  • Philosophy