Article · ACM Transactions on Graphics · Jul 26, 2023 · Closed access

Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models

Tel Aviv University

Indexed in Crossref

Abstract

Recent text-to-image generative models have demonstrated an unparalleled ability to generate diverse and creative imagery guided by a target text prompt. While revolutionary, current state-of-the-art diffusion models may still fail in generating images that fully convey the semantics in the given text prompt. We analyze the publicly available Stable Diffusion model and assess the existence of catastrophic neglect, where the model fails to generate one or more of the subjects from the input prompt. Moreover, we find that in some cases the model also fails to correctly bind attributes (e.g., colors) to their corresponding subjects. To help mitigate these failure cases, we introduce the concept of Generative…

Citation impact

Total citations: 362
FWCI: 41.12
Percentile: 100%
References: 20

Authors

5

Topics & keywords

Keywords
  • Computer science
  • Generative grammar
  • Generative model
  • Inference
  • Process (computing)
  • Image (mathematics)
  • Semantics (computer science)
  • Artificial intelligence