VLCIM: A Vision-Language Cyclic Interaction Model for Industrial Defect Detection
Beihang University · State Key Laboratory of Synthetical Automation for Process Industries
Abstract
Accurate defect detection is an important element in ensuring product quality and safe equipment operation. However, because existing methods lack deep cross-modal interaction during visual feature extraction, they often suffer from attention bias, which ultimately limits detection accuracy. To address this issue, this paper proposes a Vision-Language Cyclic Interaction Model (VLCIM), which progressively optimizes visual feature extraction by integrating domain prior knowledge with a generic large model, effectively bridging the dual-domain barriers of “generic-specific” and “vision-language”. Specifically, progressive cyclic interaction learning is proposed for the first time, which integrates a recursive…
Citation impact
- FWCI: 43.13
- Percentile: 100%
- References: 41
Authors
- 7
Topics & keywords
- Computer vision
- Computer science
- Artificial intelligence
- Machine vision
- Natural language processing