MoMask: Generative Masked Modeling of 3D Human Motions

Guo, Chuan; Mu, Yuxuan; Javed, Muhammad Gohar; Wang, Sen; Cheng, Li

doi:10.1109/cvpr52733.2024.00186

Article · Jun 16, 2024 · Closed access

University of Alberta

Indexed in Crossref

Abstract

We introduce MoMask, a novel masked modeling framework for text-driven 3D human motion generation. In MoMask, a hierarchical quantization scheme is employed to represent human motion as multi-layer discrete motion tokens with high-fidelity details. Starting at the base layer, with a sequence of motion tokens obtained by vector quantization, the residual tokens of increasing orders are derived and stored at the subsequent layers of the hierarchy. This is consequently followed by two distinct bidirectional transformers. For the base-layer motion tokens, a Masked Transformer is designated to predict randomly masked motion tokens conditioned on text input at training stage. During generation (i.e. inference)…
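The hierarchical quantization described in the abstract can be illustrated with a minimal residual vector quantization sketch. This is not the authors' implementation: the function name `residual_quantize`, the fixed NumPy codebooks, and the array shapes are assumptions made for illustration, whereas the paper's quantizer is learned. The sketch only shows how a base layer of tokens is followed by layers that quantize whatever the previous layers left unexplained.

```python
import numpy as np

def residual_quantize(x, codebooks):
    """Quantize x (T, D) with a list of codebooks, each of shape (K, D).

    Returns the token indices per layer, shape (L, T), and the
    reconstruction obtained by summing the selected codes over layers.
    Hypothetical helper for illustration only.
    """
    residual = np.asarray(x, dtype=float).copy()
    recon = np.zeros_like(residual)
    tokens = []
    for codebook in codebooks:
        # Squared distance from every residual vector to every code: (T, K).
        dist = ((residual[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
        idx = dist.argmin(axis=1)   # nearest code per time step
        chosen = codebook[idx]      # (T, D)
        tokens.append(idx)
        recon += chosen             # accumulate the layer's contribution
        residual -= chosen          # the next layer quantizes what is left
    return np.stack(tokens), recon

# Toy example: two time steps, two quantization layers (values are made up).
x = np.array([[1.0, 0.0], [0.0, 1.5]])
codebooks = [
    np.array([[1.0, 0.0], [0.0, 1.0]]),   # base layer
    np.array([[0.0, 0.0], [0.0, 0.5]]),   # first residual layer
]
tokens, recon = residual_quantize(x, codebooks)
```

In this toy case the first row of `tokens` plays the role of the base-layer tokens and the second row holds the residual tokens; summing the selected codes across both layers recovers `x` exactly because the made-up codebooks happen to contain the needed vectors.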

Citation impact

Total citations: 121

FWCI: 36.31
Percentile: 100%
References: 57

Authors: 5

Topics & keywords

Keywords

Generative grammar
Computer science
Artificial intelligence
