Multi-Concept Customization of Text-to-Image Diffusion
Carnegie Mellon University · Tsinghua University · +1 more institution
Abstract
While generative models produce high-quality images of concepts learned from a large-scale database, a user often wishes to synthesize instantiations of their own concepts (for example, their family, pets, or items). Can we teach a model to quickly acquire a new concept, given a few examples? Furthermore, can we compose multiple new concepts together? We propose Custom Diffusion, an efficient method for augmenting existing text-to-image models. We find that only optimizing a few parameters in the text-to-image conditioning mechanism is sufficiently powerful to represent new concepts while enabling fast tuning (~6 minutes). Additionally, we can jointly train for multiple concepts or combine multiple fine-tuned…
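The idea of optimizing only a few parameters in the conditioning mechanism can be illustrated with a minimal sketch. It assumes the tuned subset is the key and value projections of the cross-attention layers, where text features enter the denoiser. The `Param` class, the parameter names (`attn1` for self-attention, `attn2` for cross-attention), and the sizes are hypothetical placeholders, not the authors' code or any real model's inventory.

```python
from dataclasses import dataclass


@dataclass
class Param:
    """Hypothetical stand-in for a named model parameter."""
    name: str
    size: int
    requires_grad: bool = True


def select_trainable(params, keywords=("attn2.to_k", "attn2.to_v")):
    """Freeze every parameter except those whose name contains a keyword.

    The default keywords are an assumed naming scheme for cross-attention
    key/value projections; adapt them to the model at hand.
    """
    for p in params:
        p.requires_grad = any(k in p.name for k in keywords)
    return [p for p in params if p.requires_grad]


# Toy parameter inventory (names and sizes are made up for illustration).
params = [
    Param("down.0.resnet.conv1.weight", 1_000_000),
    Param("down.0.attn1.to_q.weight", 100_000),  # self-attention
    Param("down.0.attn1.to_k.weight", 100_000),
    Param("down.0.attn1.to_v.weight", 100_000),
    Param("down.0.attn2.to_q.weight", 100_000),  # cross-attention query (image side)
    Param("down.0.attn2.to_k.weight", 240_000),  # cross-attention key (text side)
    Param("down.0.attn2.to_v.weight", 240_000),  # cross-attention value (text side)
]

trainable = select_trainable(params)
trainable_size = sum(p.size for p in trainable)
fraction = trainable_size / sum(p.size for p in params)
```

In a real training loop, only the parameters returned by `select_trainable` would be handed to the optimizer; everything else stays at its pretrained value, which is what keeps tuning fast and the stored update small.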
Citation impact
- FWCI: 63.79
- Percentile: 100%
- References: 133
Authors: 5
Topics & keywords
- Computer science
- Personalization
- Generative model
- Image (mathematics)
- Generative grammar
- Artificial intelligence
- Quality (philosophy)
- Scale (ratio)