Article | Jun 16, 2024 | Closed access

VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models

Tencent (China)

Indexed in Crossref

Abstract

Text-to-video generation aims to produce a video based on a given prompt. Recently, several commercial video models have been able to generate plausible videos with minimal noise, excellent details, and high aesthetic scores. However, these models rely on large-scale, well-filtered, high-quality videos that are not accessible to the community. Many existing research works, which train models using the low-quality WebVid-10M dataset, struggle to generate high-quality videos because the models are optimized to fit WebVid-10M. In this work, we explore the training scheme of video models extended from Stable Diffusion and investigate the feasibility of leveraging low-quality videos and synthesized high-quality…

Citation impact

Total citations: 163
FWCI: 36.54
Percentile: 100%
References: 81

Authors

7

Topics & keywords

Keywords
  • Computer science
  • Diffusion
  • Quality (philosophy)
  • Data modeling
  • Database