LiT: Zero-Shot Transfer with Locked-image text Tuning

Zhai, Xiaohua; Wang, Xiao; Mustafa, Basil; Steiner, Andreas; Keysers, Daniel; Kolesnikov, Alexander; Beyer, Lucas

doi:10.1109/cvpr52688.2022.01759

Article. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Jun 1, 2022. Closed access

Xiaohua Zhai, Xiao Wang, Basil Mustafa, Andreas Steiner, Daniel Keysers

Google (Switzerland)

Indexed in Crossref

Abstract

This paper presents contrastive-tuning, a simple method employing contrastive training to align image and text models while still taking advantage of their pre-training. In our empirical study we find that locked pre-trained image models with unlocked text models work best. We call this instance of contrastive-tuning “Locked-image Tuning” (LiT), which just teaches a text model to read out good representations from a pre-trained image model for new tasks. A LiT model gains the capability of zero-shot transfer to new vision tasks, such as image classification or retrieval. The proposed LiT is widely applicable; it works reliably with multiple pre-training methods (supervised and unsupervised) and across…
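
The locked-image idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: plain embedding tables stand in for the image and text models, a symmetric softmax contrastive loss is assumed (the abstract does not specify the loss), and finite differences stand in for backpropagation. All function names and the sample values are hypothetical. The only point it shows is that the image embeddings are never updated, while the text embeddings are trained to match them.

```python
import math


def normalize(vec):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def contrastive_loss(image_emb, text_emb, temperature=0.1):
    """Symmetric softmax contrastive loss; pair i is (image i, text i)."""
    img = [normalize(v) for v in image_emb]
    txt = [normalize(v) for v in text_emb]
    # logits[i][j]: cosine similarity of image i and text j, divided by temperature.
    logits = [[sum(a * b for a, b in zip(i_vec, t_vec)) / temperature
               for t_vec in txt] for i_vec in img]

    def mean_diagonal_nll(rows):
        # Cross-entropy where the correct class of row i is column i.
        total = 0.0
        for i, row in enumerate(rows):
            peak = max(row)
            log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in row))
            total += log_sum_exp - row[i]
        return total / len(rows)

    columns = [list(col) for col in zip(*logits)]
    # Average the image-to-text and text-to-image directions.
    return 0.5 * (mean_diagonal_nll(logits) + mean_diagonal_nll(columns))


def tune_text_only(image_emb, text_emb, steps=200, lr=0.01, eps=1e-5):
    """Gradient descent on the text embeddings; image embeddings stay locked."""
    txt = [list(v) for v in text_emb]  # trainable copy, input is left untouched
    for _ in range(steps):
        grads = []
        for row in txt:
            row_grad = []
            for k in range(len(row)):
                original = row[k]
                # Central finite difference (assumption: stands in for autodiff).
                row[k] = original + eps
                loss_up = contrastive_loss(image_emb, txt)
                row[k] = original - eps
                loss_down = contrastive_loss(image_emb, txt)
                row[k] = original
                row_grad.append((loss_up - loss_down) / (2 * eps))
            grads.append(row_grad)
        # Only the text side receives an update.
        for row, row_grad in zip(txt, grads):
            for k, g in enumerate(row_grad):
                row[k] -= lr * g
    return txt


# Hypothetical toy data: three "pre-trained" image embeddings and
# three initially misaligned text embeddings.
locked_images = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
initial_text = [[0.2, 0.1, 0.3], [0.3, 0.2, 0.1], [0.1, 0.3, 0.2]]
tuned_text = tune_text_only(locked_images, initial_text)
print("loss before:", contrastive_loss(locked_images, initial_text))
print("loss after: ", contrastive_loss(locked_images, tuned_text))
```

In a real system the two embedding tables would be the outputs of neural networks, and locking would typically be done by excluding the image model's parameters from the optimizer rather than by construction as here.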

Citation impact

Total citations: 350

FWCI: 40.23
Percentile: 100%
References: 104

Authors: 7

Topics & keywords

Keywords

Zero (linguistics)
Shot (pellet)
Computer science
Image (mathematics)
Transfer (computing)
Computer vision
Artificial intelligence
Materials science
