Abstract

Zero-shot semantic segmentation (ZS3) aims to segment novel categories that were not seen during training. Existing works formulate ZS3 as a pixel-level zero-shot classification problem and transfer semantic knowledge from seen classes to unseen ones with the help of language models pre-trained only on text. While simple, the pixel-level ZS3 formulation has limited capability to integrate vision-language models, which are often pre-trained on image-text pairs and currently demonstrate great potential for vision tasks. Inspired by the observation that humans often perform segment-level semantic labeling, we propose to decouple ZS3 into two sub-tasks: 1) a class-agnostic grouping task to…

Citation impact

Total citations: 260
FWCI: 24.51
Percentile: 100%
References: 79

Authors

4

Topics & keywords

Keywords
  • Computer science
  • Segmentation
  • Leverage (statistics)
  • Artificial intelligence
  • Pixel
  • Decoupling (probability)
  • Task (project management)
  • Pattern recognition (psychology)
UN Sustainable Development Goals
  • Quality Education