Multi-Concept Customization of Text-to-Image Diffusion

Kumari, Nupur; Zhang, Bingliang; Zhang, Richard; Shechtman, Eli; Zhu, Jun-Yan

doi:10.1109/cvpr52729.2023.00192

Article · Jun 1, 2023 · Closed access

Carnegie Mellon University · Tsinghua University · +1 more institution

Indexed in Crossref

Abstract

While generative models produce high-quality images of concepts learned from a large-scale database, a user often wishes to synthesize instantiations of their own concepts (for example, their family, pets, or items). Can we teach a model to quickly acquire a new concept, given a few examples? Furthermore, can we compose multiple new concepts together? We propose Custom Diffusion, an efficient method for augmenting existing text-to-image models. We find that only optimizing a few parameters in the text-to-image conditioning mechanism is sufficiently powerful to represent new concepts while enabling fast tuning (~6 minutes). Additionally, we can jointly train for multiple concepts or combine multiple fine-tuned…
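
The idea of tuning only a small subset of parameters can be illustrated with a minimal sketch. The abstract does not say which parameters are optimized, so this example assumes, purely for illustration, that they are the key and value projections of cross-attention layers. The parameter names, sizes, and the helper functions `split_parameters` and `trainable_fraction` are all hypothetical and do not come from the paper or any library.

```python
import re

# Hypothetical parameter names and element counts for one block of a
# denoising network. "attn1" stands for self-attention and "attn2" for
# cross-attention with the text embedding (an assumed naming scheme).
PARAM_SIZES = {
    "down.0.resnet.conv1.weight": 320 * 320 * 9,
    "down.0.attn1.to_q.weight": 320 * 320,
    "down.0.attn1.to_k.weight": 320 * 320,
    "down.0.attn1.to_v.weight": 320 * 320,
    "down.0.attn2.to_q.weight": 320 * 320,
    "down.0.attn2.to_k.weight": 320 * 768,
    "down.0.attn2.to_v.weight": 320 * 768,
    "down.0.ff.proj.weight": 320 * 1280,
}

# Assumption: the "text-to-image conditioning mechanism" is modeled here
# as the cross-attention key and value projections only.
TRAINABLE = re.compile(r"\.attn2\.to_[kv]\.")


def split_parameters(sizes):
    """Return (trainable, frozen) sets of parameter names."""
    trainable = {name for name in sizes if TRAINABLE.search(name)}
    frozen = set(sizes) - trainable
    return trainable, frozen


def trainable_fraction(sizes):
    """Fraction of all parameter elements that would be updated."""
    trainable, _ = split_parameters(sizes)
    return sum(sizes[name] for name in trainable) / sum(sizes.values())


if __name__ == "__main__":
    trainable, frozen = split_parameters(PARAM_SIZES)
    print("trainable:", sorted(trainable))
    print("frozen:", len(frozen), "tensors")
    print("fraction updated:", round(trainable_fraction(PARAM_SIZES), 3))
```

In a real training setup, the same name filter would decide which tensors are handed to the optimizer while the rest stay frozen, which is one plausible reason such tuning can be fast.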

Citation impact

Total citations: 557
FWCI: 63.79
Percentile: 100%
References: 133

Authors: 5

Topics & keywords

Keywords

Computer science
Personalization
Generative model
Image (mathematics)
Generative grammar
Artificial intelligence
Quality (philosophy)
Scale (ratio)