Evaluating Object Hallucination in Large Vision-Language Models
Beijing Institute of Big Data Research · Renmin University of China
Abstract
Inspired by the superior language abilities of large language models (LLMs), large vision-language models (LVLMs) have recently been proposed; they integrate powerful LLMs to improve performance on complex multimodal tasks. Despite the promising progress on LVLMs, we find that they suffer from object hallucination, i.e., they tend to generate descriptions containing objects that are inconsistent with the target images. To investigate this, we present the first systematic study of object hallucination in LVLMs. We conduct evaluation experiments on several representative LVLMs and show that they mostly suffer from severe object hallucination issues. We further discuss that the visual instructions may…
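To make the notion of object hallucination concrete, the following minimal sketch compares the objects mentioned in a generated description against the objects annotated for the image. It is an illustration only, not the evaluation method of the paper: the function names `hallucinated_objects` and `hallucination_rate`, the case-insensitive matching, and the example object lists are all assumptions introduced here.

```python
def hallucinated_objects(mentioned, present):
    """Return the mentioned objects that are absent from the image annotations.

    Both arguments are iterables of object names (hypothetical inputs);
    matching is case-insensitive and exact, which is a simplifying assumption.
    """
    present_set = {obj.lower() for obj in present}
    mentioned_set = {obj.lower() for obj in mentioned}
    return sorted(mentioned_set - present_set)


def hallucination_rate(mentioned, present):
    """Fraction of distinct mentioned objects that do not appear in the image."""
    distinct = {obj.lower() for obj in mentioned}
    if not distinct:
        # No objects mentioned, so nothing can be hallucinated.
        return 0.0
    return len(hallucinated_objects(mentioned, present)) / len(distinct)


# Hypothetical example: the description mentions a bench that is not annotated.
description_objects = ["dog", "Frisbee", "bench"]
annotated_objects = ["dog", "frisbee", "person"]
print(hallucinated_objects(description_objects, annotated_objects))
print(hallucination_rate(description_objects, annotated_objects))
```

A real evaluation would also need a way to extract object mentions from free-form text and to handle synonyms, both of which this sketch leaves out.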
Citation impact
- FWCI: 39.41
- Percentile: 100%
- References: 37
Authors
- 6
Topics & keywords
- Hallucinating
- Object (grammar)
- Computer science
- Visual Hallucination
- Artificial intelligence
- Polling
- Computer vision
- Natural language processing