A Survey on Mixture of Experts in Large Language Models
Indexed in Crossref
Abstract
Large language models (LLMs) have achieved unprecedented advances across diverse fields, ranging from natural language processing to computer vision and beyond. The prowess of LLMs is underpinned by their substantial model size, extensive and diverse datasets, and the vast computational power harnessed during training, all of which contribute to the emergent abilities of LLMs (e.g., in-context learning) that are not present in small models. Within this context, the mixture of experts (MoE) has emerged as an effective method for substantially scaling up model capacity with minimal computational overhead, gaining significant attention from academia and industry. Despite its growing prevalence, there lacks a…
Citation impact
- Total citations: 83
- FWCI: 209.73
- Percentile: 100%
- References: 169
Authors
6
Topics & keywords
Keywords
- Computer science
- Natural language processing
- Data modeling
- Information retrieval
- Artificial intelligence
- Data science
- Database
UN Sustainable Development Goals
- Quality Education