AnomalyGPT: Detecting Industrial Anomalies Using Large Vision-Language Models

Gu, Zhaopeng; Zhu, Bingke; Zhu, Guibo; Chen, Yingying; Tang, Ming; Wang, Jinqiao

doi:10.1609/aaai.v38i3.27963

Article · Proceedings of the AAAI Conference on Artificial Intelligence · Mar 24, 2024 · Diamond OA

Institute of Automation · University of Chinese Academy of Sciences

Indexed in Crossref

Abstract

Large Vision-Language Models (LVLMs) such as MiniGPT-4 and LLaVA have demonstrated the capability of understanding images and achieved remarkable performance in various visual tasks. Despite their strong abilities in recognizing common objects due to extensive training datasets, they lack specific domain knowledge and have a weaker understanding of localized details within objects, which hinders their effectiveness in the Industrial Anomaly Detection (IAD) task. On the other hand, most existing IAD methods only provide anomaly scores and necessitate the manual setting of thresholds to distinguish between normal and abnormal samples, which restricts their practical implementation. In this paper, we explore the…

Citation impact

Total citations: 194

FWCI: 26.24
Percentile: 100%
References: 48

Authors: 6

Topics & keywords

Topics

Anomaly Detection Techniques and Applications (98%)

Keywords

Artificial intelligence
Computer science
Computer vision
Natural language processing

Funding

National Natural Science Foundation of China
Awards: 61976210, 62006230, 62276260, 62076235