3D Human Pose Estimation with Spatio-Temporal Criss-Cross Attention

Tang, Zhenhua; Qiu, Zhaofan; Hao, Yanbin; Hong, Richang; Yao, Ting

doi:10.1109/cvpr52729.2023.00464

Article · Jun 1, 2023 · Closed access

Zhenhua Tang · Zhaofan Qiu · Yanbin Hao · Richang Hong · Ting Yao

University of Science and Technology of China · Hefei University of Technology

Indexed in Crossref

Abstract

Recent transformer-based solutions have shown great success in 3D human pose estimation. However, computing the joint-to-joint affinity matrix incurs a cost that grows quadratically with the number of joints. This drawback is even more severe for pose estimation in a video sequence, which requires spatio-temporal correlation spanning the entire video. In this paper, we alleviate the issue by decomposing correlation learning into space and time, and present a novel Spatio-Temporal Criss-cross attention (STC) block. Technically, STC first slices its input feature into two partitions evenly along the channel dimension, followed by performing spatial and temporal…
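The decomposition described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract is truncated on this page, so the sketch assumes that spatial attention is applied to one channel half, temporal attention to the other, and that the two outputs are concatenated. The function names (`attention`, `stc_block_sketch`, `affinity_entries`) are hypothetical, the attention has no learned projections or multiple heads, and the frame and joint counts in the usage note are illustrative values only.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(x):
    """Scaled dot-product self-attention over the second-to-last axis.

    x has shape (..., N, C). Queries, keys and values are all x itself
    (an assumption made to keep the sketch short; no learned projections).
    """
    d = x.shape[-1]
    scores = x @ np.swapaxes(x, -1, -2) / np.sqrt(d)  # (..., N, N) affinities
    return softmax(scores, axis=-1) @ x


def stc_block_sketch(x):
    """Hypothetical criss-cross block for features of shape (T, J, C).

    T = frames, J = joints, C = channels (must be even).
    """
    T, J, C = x.shape
    assert C % 2 == 0, "channel dimension must split evenly"
    x_s, x_t = x[..., : C // 2], x[..., C // 2 :]

    # Spatial branch: joints attend to joints within the same frame.
    out_s = attention(x_s)  # (T, J, C/2)

    # Temporal branch: each joint attends to itself across frames.
    out_t = np.swapaxes(attention(np.swapaxes(x_t, 0, 1)), 0, 1)  # (T, J, C/2)

    # Assumed fusion step: concatenate the two halves along channels.
    return np.concatenate([out_s, out_t], axis=-1)


def affinity_entries(T, J):
    """Count affinity-matrix entries for full vs. decomposed attention."""
    full = (T * J) ** 2               # one (T*J) x (T*J) matrix
    decomposed = T * J**2 + J * T**2  # T spatial (J x J) + J temporal (T x T)
    return full, decomposed
```

For example, `stc_block_sketch(np.random.rand(9, 17, 8))` returns an array of the same shape `(9, 17, 8)`, and `affinity_entries(243, 17)` compares the number of affinity entries a joint spatio-temporal attention would need against the decomposed form.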

Citation impact

185 total citations

FWCI: 21.08
Percentile: 100%
References: 58

Authors: 5

Topics & keywords

Keywords

Embedding
Computer science
Subspace topology
Artificial intelligence
Pose
Pattern recognition (psychology)
Algorithm

Funding

National Natural Science Foundation of China
Award: 61932009