Blended Diffusion for Text-driven Editing of Natural Images

Avrahami, Omri; Lischinski, Dani; Fried, Ohad

doi:10.1109/cvpr52688.2022.01767

Article · 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) · Jun 1, 2022 · Green OA

Omri Avrahami, Dani Lischinski, Ohad Fried

Hebrew University of Jerusalem · Reichman University

Indexed in: arXiv, Crossref

Abstract

Natural language offers a highly intuitive interface for image editing. In this paper, we introduce the first solution for performing local (region-based) edits in generic natural images, based on a natural language description along with an ROI mask. We achieve our goal by leveraging and combining a pretrained language-image model (CLIP), to steer the edit towards a user-provided text prompt, with a denoising diffusion probabilistic model (DDPM) to generate natural-looking results. To seamlessly fuse the edited region with the unchanged parts of the image, we spatially blend noised versions of the input image with the local text-guided diffusion latent at a progression of noise levels. In addition, we show…
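The blending step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the linear beta schedule, the function names, and the `guided_step` callback (a stand-in for one CLIP-guided denoising step of a pretrained DDPM) are all assumptions made for illustration. The sketch only shows the general idea of combining, at each noise level, the text-guided latent inside the ROI mask with a correspondingly noised copy of the input image outside it.

```python
import numpy as np

def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def noise_image(x0, alpha_bar_t, rng):
    """Forward-diffuse the clean image x0 to the noise level given by alpha_bar_t."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps

def blend(foreground, background, mask):
    """Keep the foreground where mask == 1 and the background where mask == 0."""
    return mask * foreground + (1.0 - mask) * background

def blended_edit(x0, mask, guided_step, num_steps=50, seed=0):
    """Run the reverse process, blending with the noised input at every step.

    guided_step(x_t, t) is a hypothetical callback that returns x_{t-1};
    in a real system it would wrap a text-guided diffusion model.
    """
    rng = np.random.default_rng(seed)
    alpha_bar = linear_alpha_bar(num_steps)
    x = rng.standard_normal(x0.shape)  # start from pure Gaussian noise
    for t in range(num_steps - 1, -1, -1):
        fg = guided_step(x, t)  # edited content at the next (lower) noise level
        # Input image at the same noise level; the clean image at the last step.
        bg = noise_image(x0, alpha_bar[t - 1], rng) if t > 0 else x0
        x = blend(fg, bg, mask)
    return x
```

With a toy callback such as `lambda x, t: 0.9 * x`, the pixels outside the mask of the returned array match the input image exactly, because the final blend uses the clean input as background. The pixels inside the mask come only from the callback.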

Citation impact

Total citations: 683

FWCI: 37.71
Percentile: 100%
References: 77

Authors

3

Topics & keywords

Keywords

Computer science
Natural (archaeology)
Diffusion
Image editing
Computer graphics (images)
Artificial intelligence
World Wide Web
Multimedia

Funding

Israel Science Foundation
Awards: 2492/20, 1574/21