MHFormer: Multi-Hypothesis Transformer for 3D Human Pose Estimation

Li, Wenhao; Liu, Hong; Tang, Hao; Wang, Pichao; Van Gool, Luc

doi:10.1109/cvpr52688.2022.01280

Article · 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) · Jun 1, 2022 · Green OA

Peking University · ETH Zurich · +1 more institution

Indexed in Crossref

Abstract

Estimating 3D human poses from monocular videos is a challenging task due to depth ambiguity and self-occlusion. Most existing works attempt to solve both issues by exploiting spatial and temporal relationships. However, those works ignore the fact that it is an inverse problem where multiple feasible solutions (i.e., hypotheses) exist. To relieve this limitation, we propose a Multi-Hypothesis Transformer (MHFormer) that learns spatio-temporal representations of multiple plausible pose hypotheses. In order to effectively model multi-hypothesis dependencies and build strong relationships across hypothesis features, the task is decomposed into three stages: (i) Generate multiple initial hypothesis…

Citation impact

Total citations: 409

FWCI: 22.54
Percentile: 100%
References: 59

Authors: 5

Topics & keywords

Keywords

Computer science
Ambiguity
Artificial intelligence
Merge (version control)
Transformer
Machine learning
Intuition
Pattern recognition (psychology)

Funding

National Key Research and Development Program of China
Award: 2020AAA0108904