Multi-class Token Transformer for Weakly Supervised Semantic Segmentation
The University of Western Australia · The University of Sydney · +2 more institutions
Abstract
This paper proposes a new transformer-based framework to learn class-specific object localization maps as pseudo labels for weakly supervised semantic segmentation (WSSS). Inspired by the fact that the attended regions of the one-class token in the standard vision transformer can be leveraged to form a class-agnostic localization map, we investigate whether the transformer model can also effectively capture class-specific attention for more discriminative object localization by learning multiple class tokens within the transformer. To this end, we propose a Multi-class Token Transformer, termed MCTformer, which uses multiple class tokens to learn interactions between the class tokens and the patch tokens. The…
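The idea of reading a localization map off each class token's attention over the patch tokens can be illustrated with a small sketch. This is not the MCTformer implementation: it assumes a single attention head with no learned projections, uses random embeddings, and every name in it (`class_to_patch_attention`, `DIM`, `GRID_H`, and so on) is hypothetical. It only shows how one attention map per class token might be extracted and reshaped to the patch grid.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def class_to_patch_attention(class_tokens, patch_tokens, grid_h, grid_w):
    """Return one grid_h x grid_w attention map per class token.

    Assumption: plain scaled dot-product attention with the raw token
    embeddings used as both queries and keys (no learned weights).
    """
    assert len(patch_tokens) == grid_h * grid_w
    dim = len(class_tokens[0])
    num_classes = len(class_tokens)
    keys = class_tokens + patch_tokens  # each class token attends to every token
    maps = []
    for query in class_tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in keys
        ]
        attn = softmax(scores)
        patch_attn = attn[num_classes:]  # keep only the class-to-patch weights
        # Reshape the flat patch weights back onto the 2D patch grid.
        maps.append(
            [patch_attn[r * grid_w:(r + 1) * grid_w] for r in range(grid_h)]
        )
    return maps


# Toy sizes chosen for illustration only.
random.seed(0)
DIM, NUM_CLASSES, GRID_H, GRID_W = 8, 3, 2, 2
class_tokens = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_CLASSES)]
patch_tokens = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(GRID_H * GRID_W)]
maps = class_to_patch_attention(class_tokens, patch_tokens, GRID_H, GRID_W)
```

Each entry of `maps` is a small 2D grid of attention weights for one class token. In a trained model such grids would presumably be upsampled to the image size before being used as pseudo labels; with random embeddings they carry no semantic meaning.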
Citation impact
- FWCI: 15.75
- Percentile: 100%
- References: 68
Authors: 5
Topics & keywords
- Security token
- Computer science
- Discriminative model
- Artificial intelligence
- Transformer
- Segmentation
- Class (philosophy)
- Pattern recognition (psychology)
- Reduced inequalities