MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing
The University of Tokyo · Tencent (China)
Abstract
Despite the success of large-scale text-to-image generation and text-conditioned image editing, existing methods still struggle to produce consistent generation and editing results. For example, generation approaches usually fail to synthesize multiple images of the same objects or characters with different views or poses. Meanwhile, existing editing methods either fail to achieve effective complex non-rigid editing while maintaining the overall textures and identity, or require time-consuming fine-tuning to capture the image-specific appearance. In this paper, we develop MasaCtrl, a tuning-free method to achieve consistent image generation and complex non-rigid image editing simultaneously. Specifically,…
Citation impact
- FWCI: 38.01
- Percentile: 100%
- References: 63
Authors
- 6
Topics & keywords
- Image editing
- Computer science
- Image (mathematics)
- Confusion
- Artificial intelligence
- Consistency (knowledge bases)
- Computer vision