Open-Vocabulary Camouflaged Object Segmentation with Cascaded Vision Language Models

Zhao, Kai; Yuan, Wubang; Wang, Zheng; Li, Guanyi; Zhu, Xiaoqiang; Fan, Deng-Ping; Zeng, Dan

doi:10.26599/cvm.2025.9450512

Article · Computational Visual Media · Jan 27, 2026 · Diamond OA

Shanghai University · Institute for Advanced Study

Indexed in: arXiv, Crossref, DataCite, DOAJ

Abstract

Open-vocabulary camouflaged object segmentation (OVCOS) seeks to segment and classify camouflaged objects in arbitrary categories, presenting unique challenges due to visual ambiguity and unseen categories. Recent approaches typically adopt a two-stage paradigm: they first segment objects and then classify the segmented regions using vision language models (VLMs). However, such methods (i) suffer from a domain gap caused by the mismatch between VLMs' full-image training and cropped-region inference, and (ii) depend on generic segmentation models optimized for well-delineated objects, which are less effective for camouflaged objects. Without explicit guidance, generic segmentation models often overlook subtle…

Citation impact

Total citations: 5

FWCI: 108.09
Percentile: 100%
References: 0

Authors

7 authors

Kai Zhao (corresponding author), Shanghai University
Wubang Yuan, Shanghai University
Zheng Wang, Shanghai University
Guanyi Li, Shanghai University
Xiaoqiang Zhu, Shanghai University

Topics & keywords

Keywords

Segmentation
Leverage (statistics)
Ambiguity
Object (grammar)
Context (archaeology)
Market segmentation
Scale-space segmentation
Semantics (computer science)

Funding

National Natural Science Foundation of China
Awards: 62372284, 62476143