PoseFormerV2: Exploring Frequency Domain for Efficient and Robust 3D Human Pose Estimation

Zhao, Qitao; Zheng, Ce; Liu, Mengyuan; Wang, Pichao; Chen, Chen

doi:10.1109/cvpr52729.2023.00857

Article · Jun 1, 2023 · Closed access

Shandong University · Peking University · +2 more institutions

Indexed in Crossref

Abstract

Recently, transformer-based methods have gained significant success in sequential 2D-to-3D lifting human pose estimation. As a pioneering work, PoseFormer captures spatial relations of human joints in each video frame and human dynamics across frames with cascaded transformer layers and has achieved impressive performance. However, in real scenarios, the performance of PoseFormer and its follow-ups is limited by two factors: (a) The length of the input joint sequence; (b) The quality of 2D joint detection. Existing methods typically apply self-attention to all frames of the input sequence, causing a huge computational burden when the frame number is increased to obtain advanced estimation accuracy, and they…
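The cost argument in the abstract can be illustrated with a small sketch. It assumes one token per frame and full self-attention over the sequence, so the number of query-key pairs scored grows with the square of the frame count. The function name and the frame counts below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def attention_pair_count(num_frames: int) -> int:
    """Query-key pairs scored by full self-attention with one token per frame."""
    if num_frames < 0:
        raise ValueError("num_frames must be non-negative")
    return num_frames * num_frames


if __name__ == "__main__":
    # Hypothetical sequence lengths, used only to show the quadratic growth.
    for frames in (9, 27, 81, 243):
        print(f"{frames:>4} frames -> {attention_pair_count(frames):>6} pairs")
```

Tripling the sequence length multiplies the pair count by nine, which is the kind of growth the abstract refers to when it describes a large computational burden at higher frame numbers.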

Citation impact

Total citations: 206

FWCI: 23.47
Percentile: 100%
References: 46

Authors: 5

Topics & keywords

Keywords

Computer science
Robustness (evolution)
Frequency domain
Transformer
Artificial intelligence
Exploit
Joint (building)
Benchmark (surveying)
