Preprint · arXiv (Cornell University) · Oct 5, 2022 · Green OA

Imagen Video: High Definition Video Generation with Diffusion Models

Indexed in: arXiv, DataCite

Abstract

We present Imagen Video, a text-conditional video generation system based on a cascade of video diffusion models. Given a text prompt, Imagen Video generates high definition videos using a base video generation model and a sequence of interleaved spatial and temporal video super-resolution models. We describe how we scale up the system as a high definition text-to-video model including design decisions such as the choice of fully-convolutional temporal and spatial super-resolution models at certain resolutions, and the choice of the v-parameterization of diffusion models. In addition, we confirm and transfer findings from previous work on diffusion-based image generation to the video generation setting.…
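The abstract names the v-parameterization without defining it. As a rough illustration only, the sketch below uses the common variance-preserving definition, in which a noisy sample is z_t = alpha_t * x + sigma_t * eps with alpha_t^2 + sigma_t^2 = 1, and the network's regression target is v = alpha_t * eps - sigma_t * x. The function names, the tensor shape, and the choice of alpha_t and sigma_t are assumptions made for this example and are not taken from the record above.

```python
import numpy as np


def v_target(x, eps, alpha_t, sigma_t):
    """Velocity target: v = alpha_t * eps - sigma_t * x."""
    return alpha_t * eps - sigma_t * x


def x_from_v(z_t, v, alpha_t, sigma_t):
    """Recover the clean sample from z_t and v (needs alpha_t^2 + sigma_t^2 = 1)."""
    return alpha_t * z_t - sigma_t * v


def eps_from_v(z_t, v, alpha_t, sigma_t):
    """Recover the noise from z_t and v (needs alpha_t^2 + sigma_t^2 = 1)."""
    return sigma_t * z_t + alpha_t * v


rng = np.random.default_rng(0)

# Hypothetical toy "video" tensor: (batch, frames, height, width, channels).
x = rng.standard_normal((2, 4, 8, 8, 3))
eps = rng.standard_normal(x.shape)

# Assumed variance-preserving schedule point: alpha = cos(phi), sigma = sin(phi).
phi = 0.3
alpha_t, sigma_t = np.cos(phi), np.sin(phi)

z_t = alpha_t * x + sigma_t * eps          # forward (noising) step
v = v_target(x, eps, alpha_t, sigma_t)     # what a v-prediction model regresses

x_rec = x_from_v(z_t, v, alpha_t, sigma_t)
eps_rec = eps_from_v(z_t, v, alpha_t, sigma_t)
```

Given a perfect v prediction, both the clean sample and the noise can be recovered from z_t by a linear map, so the same output can serve either role in a sampler.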

Citation impact

Total citations: 346
References: 0

Authors

11

Topics & keywords

Keywords
  • Computer science
  • Video tracking
  • Video compression picture types
  • Video processing
  • Artificial intelligence
  • Computer vision
  • Video quality