3D Human Pose Estimation with Spatio-Temporal Criss-Cross Attention
University of Science and Technology of China · Hefei University of Technology
Abstract
Recent transformer-based solutions have shown great success in 3D human pose estimation. However, the cost of computing the joint-to-joint affinity matrix grows quadratically with the number of joints. This drawback is even worse for pose estimation in a video sequence, which requires spatio-temporal correlation spanning the entire video. In this paper, we address the issue by decomposing correlation learning into space and time, and present a novel Spatio-Temporal Criss-cross attention (STC) block. Technically, STC first slices its input feature evenly into two partitions along the channel dimension, followed by performing spatial and temporal…
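The decomposition described in the abstract can be illustrated with a rough, dependency-free sketch. It is not the authors' implementation: the function names, the identity query/key/value projections, and the final concatenation of the two halves are all assumptions made for illustration, since the abstract text above is truncated before it says how the two attention outputs are combined.

```python
import math


def affinity_counts(num_frames, num_joints):
    """Count pairwise affinities for joint spatio-temporal attention
    versus attention decomposed into space and time."""
    tokens = num_frames * num_joints
    full = tokens ** 2                         # every joint in every frame attends to all others
    spatial = num_frames * num_joints ** 2     # one J x J matrix per frame
    temporal = num_joints * num_frames ** 2    # one T x T matrix per joint
    return full, spatial + temporal


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention over a list of vectors.
    Assumption: identity Q/K/V projections, to keep the sketch short."""
    dim = len(tokens[0])
    out = []
    for query in tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in tokens]
        weights = softmax(scores)
        out.append([sum(w * value[c] for w, value in zip(weights, tokens))
                    for c in range(dim)])
    return out


def stc_block_sketch(x):
    """x[t][j] is the channel vector of joint j in frame t."""
    num_frames, num_joints, channels = len(x), len(x[0]), len(x[0][0])
    half = channels // 2  # even split along the channel dimension

    # Spatial branch: within each frame, attend across joints (first half of channels).
    spatial = [self_attention([x[t][j][:half] for j in range(num_joints)])
               for t in range(num_frames)]
    # Temporal branch: for each joint, attend across frames (second half of channels).
    temporal = [self_attention([x[t][j][half:] for t in range(num_frames)])
                for j in range(num_joints)]

    # Assumption: the two halves are concatenated back along the channel dimension.
    return [[spatial[t][j] + temporal[j][t] for j in range(num_joints)]
            for t in range(num_frames)]
```

For a hypothetical clip of 4 frames with 3 joints each, `affinity_counts(4, 3)` returns `(144, 84)`: the decomposed form computes fewer affinities, and the gap widens as the sequence gets longer. `stc_block_sketch` returns a nested list with the same frame, joint, and channel sizes as its input.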
Citation impact
- FWCI: 21.08
- Percentile: 100%
- References: 58
Authors
- Zhenhua Tang (corresponding author)
  University of Science and Technology of China, Hefei University of Technology
- Zhaofan Qiu
  University of Science and Technology of China, Hefei University of Technology
- Yanbin Hao
  Hefei University of Technology, University of Science and Technology of China
- Richang Hong
  University of Science and Technology of China, Hefei University of Technology
- Ting Yao
  Hefei University of Technology, University of Science and Technology of China
Topics & keywords
- Embedding
- Computer science
- Subspace topology
- Artificial intelligence
- Pose
- Pattern recognition (psychology)
- Algorithm