Prompt-to-Prompt Image Editing with Cross Attention Control

Hertz, Amir; Mokady, Ron; Tenenbaum, Jay M.; Aberman, Kfir; Pritch, Yael; Cohen‐Or, Daniel

doi:10.48550/arxiv.2208.01626

Preprint, arXiv (Cornell University), Aug 2, 2022, Green OA

Indexed in: arXiv, DataCite

Abstract

Recent large-scale text-driven synthesis models have attracted much attention thanks to their remarkable capabilities of generating highly diverse images that follow given text prompts. Such text-based synthesis methods are particularly appealing to humans, who are used to describing their intent verbally. Therefore, it is only natural to extend text-driven image synthesis to text-driven image editing. Editing is challenging for these generative models, since an innate property of an editing technique is to preserve most of the original image, whereas in text-based models, even a small modification of the text prompt often leads to a completely different outcome. State-of-the-art methods mitigate this by…

Citation impact

Total citations: 363

FWCI: —
Percentile: —
References: 0

Authors: 6

Topics & keywords

Keywords

Image editing
Computer science
Image (mathematics)
Word (group theory)
Image synthesis
Fidelity
Generative grammar
Key (lock)
