A Survey on Mixture of Experts in Large Language Models

Cai, Weilin; Jiang, Juyong; Wang, Fan; Tang, Jing; Kim, Sung Hun; Huang, Jiayi

doi:10.1109/tkde.2025.3554028

Article | IEEE Transactions on Knowledge and Data Engineering | Jan 1, 2025 | Closed access

Indexed in Crossref

Abstract

Large language models (LLMs) have driven unprecedented advances across diverse fields, ranging from natural language processing to computer vision and beyond. The prowess of LLMs is underpinned by their substantial model size, extensive and diverse datasets, and the vast computational power harnessed during training, all of which contribute to emergent abilities (e.g., in-context learning) that are not present in small models. Within this context, the mixture of experts (MoE) has emerged as an effective method for substantially scaling up model capacity with minimal computation overhead, gaining significant attention from academia and industry. Despite its growing prevalence, there lacks a…

Citation impact

Total citations: 83

FWCI: 209.73
Percentile: 100%
References: 169

Authors: 6

Topics & keywords

Topics

Expert finding and Q&A systems (96%)

Keywords

Computer science
Natural language processing
Data modeling
Information retrieval
Artificial intelligence
Data science
Database

UN Sustainable Development Goals

Quality Education
