Conditional Prompt Learning for Vision-Language Models

Zhou, Kaiyang; Yang, Jingkang; Loy, Chen Change; Liu, Ziwei

doi:10.1109/cvpr52688.2022.01631

Article | 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) | Jun 1, 2022 | Closed access

Kaiyang Zhou, Jingkang Yang, Chen Change Loy, Ziwei Liu

Nanyang Technological University

Indexed in Crossref

Abstract

With the rise of powerful pre-trained vision-language models like CLIP, it becomes essential to investigate ways to adapt these models to downstream datasets. A recently proposed method named Context Optimization (CoOp) introduces the concept of prompt learning—a recent trend in NLP—to the vision domain for adapting pre-trained vision-language models. Specifically, CoOp turns context words in a prompt into a set of learnable vectors and, with only a few labeled images for learning, can achieve huge improvements over intensively-tuned manual prompts. In our study we identify a critical problem of CoOp: the learned context is not generalizable to wider unseen classes within the same dataset, suggesting that CoOp…
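
The mechanism the abstract attributes to CoOp, replacing the context words of a prompt with a set of learnable vectors shared across classes, can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation. The mean-pooling `toy_text_encoder`, the random embeddings, the dimensions, and the temperature value are all assumptions chosen so the example runs with the standard library alone. A real setup would use CLIP's frozen encoders and update only the context vectors by gradient descent, which is omitted here.

```python
import math
import random

def build_prompt(context, class_embedding):
    """Prepend the shared learnable context vectors to one class-name embedding."""
    return [list(v) for v in context] + [list(class_embedding)]

def toy_text_encoder(prompt):
    """Hypothetical stand-in for a frozen text encoder: mean-pool the token vectors."""
    dim = len(prompt[0])
    return [sum(tok[d] for tok in prompt) / len(prompt) for d in range(dim)]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def class_probabilities(image_feature, context, class_embeddings, temperature=0.01):
    """Softmax over temperature-scaled cosine similarities, one logit per class."""
    logits = [
        cosine(image_feature, toy_text_encoder(build_prompt(context, c))) / temperature
        for c in class_embeddings
    ]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(logit - top) for logit in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative sizes and random values (assumptions, not taken from the paper).
rng = random.Random(0)
DIM, NUM_CONTEXT, NUM_CLASSES = 8, 4, 3

# The only trainable parameters in this scheme: NUM_CONTEXT vectors shared by all classes.
context = [[rng.gauss(0.0, 0.02) for _ in range(DIM)] for _ in range(NUM_CONTEXT)]

# Fixed embeddings standing in for class-name tokens and for an encoded image.
class_embeddings = [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(NUM_CLASSES)]
image_feature = [rng.gauss(0.0, 1.0) for _ in range(DIM)]

probs = class_probabilities(image_feature, context, class_embeddings)
print(probs)
```

Because the same `context` is reused for every class, it is fitted only to the classes seen during training, which is the setting in which the abstract reports poor generalization to unseen classes.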

Citation impact

Total citations: 1,466

FWCI (Field-Weighted Citation Impact): 77.66
Percentile: 100%
References: 88

Authors: 4

Topics & keywords

Keywords

Computer science
Artificial intelligence
Generalization
Machine learning
Context (archaeology)
Set (abstract data type)
Class (philosophy)
Code (set theory)
