Article · Jun 1, 2023 · Closed access

SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation

Xi'an Jiaotong University · Tencent (China)

Indexed in Crossref

Abstract

Generating talking head videos from a face image and a piece of speech audio still poses many challenges, i.e., unnatural head movement, distorted expression, and identity modification. We argue that these issues are mainly caused by learning from the coupled 2D motion fields. On the other hand, explicitly using 3D information also suffers from problems of stiff expression and incoherent video. We present SadTalker, which generates 3D motion coefficients (head pose, expression) of the 3DMM from audio and implicitly modulates a novel 3D-aware face render for talking head generation. To learn the realistic motion coefficients, we explicitly model the connections between audio and different types of motion…

Citation impact

Total citations: 269
FWCI: 30.62
Percentile: 100%
References: 65

Authors

8 authors

Topics & keywords

Keywords
  • Computer science
  • Animation
  • Artificial intelligence
  • Face (sociological concept)
  • Motion (physics)
  • Computer vision
  • Expression (computer science)
  • Computer facial animation

Funding