Evaluating Object Hallucination in Large Vision-Language Models

Li, Yifan; Du, Yifan; Zhou, Kun; Wang, Jinpeng; Zhao, Wayne Xin; Wen, Ji-Rong

doi:10.18653/v1/2023.emnlp-main.20

Article · Jan 1, 2023 · Gold OA

Beijing Institute of Big Data Research · Renmin University of China

Indexed in Crossref

Abstract

Inspired by the superior language abilities of large language models (LLMs), large vision-language models (LVLMs) have recently been proposed, integrating powerful LLMs to improve performance on complex multimodal tasks. Despite the promising progress on LVLMs, we find that they suffer from object hallucination, i.e., they tend to generate objects in their descriptions that are inconsistent with the target images. To investigate this, this work presents the first systematic study of object hallucination in LVLMs. We conduct evaluation experiments on several representative LVLMs and show that they mostly suffer from severe object hallucination issues. We further discuss that the visual instructions may…

Citation impact

Total citations: 347

FWCI: 39.41
Percentile: 100%
References: 37

Authors: 6

Topics & keywords

Keywords

Hallucinating
Object (grammar)
Computer science
Visual Hallucination
Artificial intelligence
Polling
Computer vision
Natural language processing

Funding

National Natural Science Foundation of China
Awards: 62222215, L233008