Article | Oct 1, 2023 | Closed access
Structure and Content-Guided Video Synthesis with Diffusion Models
Indexed in Crossref
Abstract
Text-guided generative diffusion models unlock powerful image creation and editing tools. Recent approaches that edit the content of footage while retaining structure require expensive re-training for every input or rely on error-prone propagation of image edits across frames. In this work, we present a structure and content-guided video diffusion model that edits videos based on descriptions of the desired output. Conflicts between user-provided content edits and structure representations occur due to insufficient disentanglement between the two aspects. As a solution, we show that training on monocular depth estimates with varying levels of detail provides control over structure and content fidelity. A novel…
Citation impact
- Total citations: 342
- FWCI: 38.84
- Percentile: 100%
- References: 55
Authors: 5
Topics & keywords
Topics
Keywords
- Computer science
- Consistency (knowledge bases)
- Personalization
- Fidelity
- Generative grammar
- Generative model
- Control (management)
- Variety (cybernetics)