Article · 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) · Jun 1, 2022 · Closed access
DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting
Indexed in Crossref
Abstract
Recent progress has shown that large-scale pre-training using contrastive image-text pairs can be a promising alternative for high-quality visual representation learning from natural language supervision. Benefiting from a broader source of supervision, this new paradigm exhibits impressive transferability to downstream classification tasks and datasets. However, the problem of transferring the knowledge learned from image-text pairs to more complex dense prediction tasks has barely been visited. In this work, we present a new framework for dense prediction by implicitly and explicitly leveraging the pre-trained knowledge from CLIP. Specifically, we convert the original image-text matching problem in CLIP to a…
Citation impact
536 total citations
- FWCI: 29.68
- Percentile: 100%
- References: 72
Authors: 8
Topics & keywords
Keywords
- Computer science
- Artificial intelligence
- Exploit
- Context (archaeology)
- Segmentation
- Natural language processing
- Matching (statistics)
- Machine learning
UN Sustainable Development Goals
- Quality Education