STM: SpatioTemporal and Motion Encoding for Action Recognition

Jiang, Boyuan; Wang, Mengmeng; Gan, Weihao; Wu, Wei; Yan, Junjie

doi:10.1109/iccv.2019.00209

Article · Oct 1, 2019 · Closed access

Zhejiang University · Group Sense (China)

Indexed in Crossref

Abstract

Spatiotemporal and motion features are two complementary and crucial types of information for video action recognition. Recent state-of-the-art methods adopt a 3D CNN stream to learn spatiotemporal features and a separate flow stream to learn motion features. In this work, we aim to efficiently encode both kinds of features in a unified 2D framework. To this end, we first propose an STM block, which contains a Channel-wise SpatioTemporal Module (CSTM) to represent the spatiotemporal features and a Channel-wise Motion Module (CMM) to efficiently encode motion features. We then replace the original residual blocks in the ResNet architecture with STM blocks to form a simple yet effective STM network by introducing very limited extra…
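The abstract is truncated and gives no layer-level details, so the following is only a toy sketch of the general idea, not the authors' implementation. It assumes each frame has already been reduced to one value per channel (spatial dimensions are ignored), that the channel-wise spatiotemporal part can be approximated by a per-channel 3-tap filter along time, and that motion can be approximated by differencing features of adjacent frames. The function names `channelwise_temporal_conv`, `motion_features`, and `stm_like_block` are hypothetical.

```python
def channelwise_temporal_conv(x, kernels):
    """Apply one 3-tap temporal filter per channel (assumed stand-in for CSTM).

    x: list of T frames, each a list of C channel values.
    kernels: list of C filters, each a list of 3 weights.
    """
    T, C = len(x), len(x[0])
    out = [[0.0] * C for _ in range(T)]
    for t in range(T):
        for c in range(C):
            for k, w in enumerate(kernels[c]):
                src = t + k - 1  # taps cover frames t-1, t, t+1
                if 0 <= src < T:  # zero padding at the clip boundaries
                    out[t][c] += w * x[src][c]
    return out


def motion_features(x):
    """Difference features of adjacent frames (assumed stand-in for CMM)."""
    T, C = len(x), len(x[0])
    out = [[x[t + 1][c] - x[t][c] for c in range(C)] for t in range(T - 1)]
    out.append([0.0] * C)  # the last frame has no successor, so pad with zeros
    return out


def stm_like_block(x, kernels):
    """Residual-style combination: input + temporal branch + motion branch."""
    st = channelwise_temporal_conv(x, kernels)
    mo = motion_features(x)
    T, C = len(x), len(x[0])
    return [[x[t][c] + st[t][c] + mo[t][c] for c in range(C)] for t in range(T)]


# Toy clip: 3 frames, 2 channels; channel 0 changes over time, channel 1 is static.
clip = [[1.0, 0.0], [2.0, 0.0], [4.0, 0.0]]
# Identity temporal filters, so the temporal branch returns its input unchanged.
kernels = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]
block_out = stm_like_block(clip, kernels)
```

In this toy setup the motion branch is non-zero only for the channel that changes between frames, which illustrates why frame-to-frame feature differences can serve as an inexpensive motion cue inside a 2D network without a separate flow stream.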

Citation impact

Total citations: 455

FWCI: 25.72
Percentile: 100%
References: 58

Authors: 5

Topics & keywords

Keywords

Computer science
ENCODE
Encoding (memory)
Artificial intelligence
Motion (physics)
Block (permutation group theory)
Action recognition
Computation
