A Survey of Large Language Models for Arabic Language and its Dialects

Mashaabi, Malak; Al-Khalifa, Shahad; Al-Khalifa, Hend S.

doi:10.1145/3807946

Preprint. ACM Transactions on Asian and Low-Resource Language Information Processing. Apr 9, 2026. Green OA.

King Saud University

Indexed in: arXiv, Crossref, DataCite

Abstract

This survey presents a comprehensive review of Large Language Models (LLMs) developed for the Arabic language and its dialects. It categorizes models by architecture (encoder-only, decoder-only, and encoder-decoder) and by linguistic form, including Classical Arabic, Modern Standard Arabic (MSA), and Dialectal Arabic. We analyze monolingual, bilingual, and multilingual models, evaluating their performance on tasks such as sentiment analysis, named entity recognition, and question answering. The survey also assesses model openness, considering factors like access to source code, training data, weights, and documentation. Our findings highlight a concentration of resources on MSA, a lack of diverse dialectal datasets,…

Citation impact

Total citations: 5
FWCI: 48.05
Percentile: 99%
References: 49

Keywords

Arabic
Linguistics
Computer science
Natural language processing
Philosophy