DriveGPT4: Interpretable End-to-End Autonomous Driving Via Large Language Model

Xu, Zhenhua; Zhang, Yujia; Xie, Enze; Zhen, Zhao; Guo, Yong; Wong, Kenneth K.; Li, Zhenguo; Zhao, Hengshuang

doi:10.1109/lra.2024.3440097

Article · IEEE Robotics and Automation Letters · Aug 7, 2024 · Green OA

University of Hong Kong · Zhejiang University · +2 more institutions

Indexed in Crossref

Abstract

Multimodal large language models (MLLMs) have emerged as a prominent area of interest within the research community, given their proficiency in handling and reasoning with non-textual data, including images and videos. This study seeks to extend the application of MLLMs to the realm of autonomous driving by introducing DriveGPT4, a novel interpretable end-to-end autonomous driving system based on LLMs. Capable of processing multi-frame video inputs and textual queries, DriveGPT4 facilitates the interpretation of vehicle actions, offers pertinent reasoning, and effectively addresses a diverse range of questions posed by users. Furthermore, DriveGPT4 predicts low-level vehicle control signals in an end-to-end…

Citation impact

Total citations: 311

FWCI: 70.29
Percentile: 100%
References: 68

Authors (8)

Zhenhua Xu (corresponding author), University of Hong Kong
Yujia Zhang, Zhejiang University
Enze Xie, Huawei Technologies (Canada)
Zhao Zhen, The University of Sydney
Yong Guo, Huawei Technologies (Canada)

Topics & keywords

Keywords

End-to-end principle
Computer science
Language model
End of history
Artificial intelligence
Political science

Funding

National Natural Science Foundation of China
Award: 62201484