Multi-class Token Transformer for Weakly Supervised Semantic Segmentation

The University of Western Australia · The University of Sydney · +2 more institutions

Indexed in Crossref

Abstract

This paper proposes a new transformer-based framework to learn class-specific object localization maps as pseudo labels for weakly supervised semantic segmentation (WSSS). Inspired by the fact that the attended regions of the one-class token in the standard vision transformer can be leveraged to form a class-agnostic localization map, we investigate whether the transformer model can also effectively capture class-specific attention for more discriminative object localization by learning multiple class tokens within the transformer. To this end, we propose a Multi-class Token Transformer, termed MCTformer, which uses multiple class tokens to learn interactions between the class tokens and the patch tokens. The…
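The idea of reading class-specific localization maps from class-to-patch attention can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a single attention head in a single layer, random untrained weights, and hypothetical names (`class_to_patch_attention`, `w_q`, `w_k`, `grid_hw`). It only shows how the attention rows that belong to several class tokens can be reshaped into one spatial map per class.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def class_to_patch_attention(tokens, w_q, w_k, num_classes, grid_hw):
    """Return one attention map per class token.

    tokens: array of shape (num_classes + num_patches, dim), with the
            class tokens stacked before the patch tokens (an assumption
            of this sketch).
    """
    h, w = grid_hw
    q = tokens @ w_q
    k = tokens @ w_k
    # Scaled dot-product attention over all tokens (class and patch).
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)
    # Keep rows of the class tokens and columns of the patch tokens.
    cls_to_patch = attn[:num_classes, num_classes:]
    # Lay each row back out on the patch grid: (num_classes, h, w).
    return cls_to_patch.reshape(num_classes, h, w)


# Toy usage with random data (hypothetical sizes, untrained weights).
rng = np.random.default_rng(0)
num_classes, grid_hw, dim = 3, (4, 4), 8
num_patches = grid_hw[0] * grid_hw[1]
tokens = rng.standard_normal((num_classes + num_patches, dim))
w_q = rng.standard_normal((dim, dim))
w_k = rng.standard_normal((dim, dim))

maps = class_to_patch_attention(tokens, w_q, w_k, num_classes, grid_hw)
print(maps.shape)  # (3, 4, 4)
```

With untrained weights the maps carry no meaning. In a trained model, each class token's map would be expected to highlight the patches relevant to that class.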

Citation impact

Total citations: 278
FWCI: 15.75
Percentile: 100%
References: 68

Authors

5

Topics & keywords

Keywords
  • Security token
  • Computer science
  • Discriminative model
  • Artificial intelligence
  • Transformer
  • Segmentation
  • Class (philosophy)
  • Pattern recognition (psychology)
UN Sustainable Development Goals
  • Reduced inequalities

Funding