DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
Boston University · Google (United States)
Abstract
Large text-to-image models achieved a remarkable leap in the evolution of AI, enabling high-quality and diverse synthesis of images from a given text prompt. However, these models lack the ability to mimic the appearance of subjects in a given reference set and synthesize novel renditions of them in different contexts. In this work, we present a new approach for “personalization” of text-to-image diffusion models. Given as input just a few images of a subject, we fine-tune a pretrained text-to-image model such that it learns to bind a unique identifier with that specific subject. Once the subject is embedded in the output domain of the model, the unique identifier can be used to synthesize novel photorealistic…
Citation impact
- FWCI: 218.69
- Percentile: 100%
- References: 107
Authors: 6
Topics & keywords
- Computer science
- Diffusion
- Subject (documents)
- Image (mathematics)
- Artificial intelligence
- Computer vision
- Physics
- World Wide Web
- Sustainable cities and communities