Article · Jun 16, 2024 · Closed access

PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding

Nankai University · Tencent (China)

Indexed in Crossref

Abstract

Recent advances in text-to-image generation have made remarkable progress in synthesizing realistic human photos conditioned on given text prompts. However, existing personalized generation methods cannot simultaneously satisfy the requirements of high efficiency, promising identity (ID) fidelity, and flexible text controllability. In this work, we introduce PhotoMaker, an efficient personalized text-to-image generation method, which mainly encodes an arbitrary number of input ID images into a stack ID embedding for preserving ID information. Such an embedding, serving as a unified ID representation, can not only encapsulate the characteristics of the same input ID comprehensively, but also accommodate…

Citation impact

Total citations: 108
FWCI: 24.41
Percentile: 100%
References: 105

Authors

6

Topics & keywords

Keywords
  • Computer science
  • Embedding
  • Computer graphics (images)
  • Artificial intelligence

Funding