A Survey on Multimodal Large Language Models for Autonomous Driving
Purdue University West Lafayette · University of Illinois Urbana-Champaign · +3 more institutions
Abstract
With the emergence of Large Language Models (LLMs) and Vision Foundation Models (VFMs), multimodal AI systems built on large models have the potential to perceive the real world, make decisions, and control tools as humans do. In recent months, LLMs have attracted widespread attention in autonomous driving and map systems. Despite this immense potential, a comprehensive understanding of the key challenges, opportunities, and future directions for applying LLMs to driving systems is still lacking. In this paper, we present a systematic investigation of this field. We first introduce the background of Multimodal Large Language Models (MLLMs), the development of multimodal models using LLMs, and the history of…
Citation impact
- FWCI: 60.58
- Percentile: 100%
- References: 223
Authors
21
Topics & keywords
- Computer science
- Human–computer interaction