PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding
Nankai University · Tencent (China)
Abstract
Recent advances in text-to-image generation have made remarkable progress in synthesizing realistic human photos conditioned on given text prompts. However, existing personalized generation methods cannot simultaneously satisfy the requirements of high efficiency, promising identity (ID) fidelity, and flexible text controllability. In this work, we introduce PhotoMaker, an efficient personalized text-to-image generation method, which mainly encodes an arbitrary number of input ID images into a stack ID embedding for preserving ID information. Such an embedding, serving as a unified ID representation, can not only encapsulate the characteristics of the same input ID comprehensively, but also accommodate…
Citation impact
- FWCI: 24.41
- Percentile: 100%
- References: 105
Authors
- 6
Topics & keywords
- Computer science
- Embedding
- Computer graphics (images)
- Artificial intelligence