TSM: Temporal Shift Module for Efficient Video Understanding
Massachusetts Institute of Technology · Moscow Institute of Thermal Technology · +1 more institution
Abstract
The explosive growth in video streaming poses the challenge of performing video understanding at high accuracy and low computation cost. Conventional 2D CNNs are computationally cheap but cannot capture temporal relationships; 3D CNN-based methods achieve good performance but are computationally intensive, making them expensive to deploy. In this paper, we propose a generic and effective Temporal Shift Module (TSM) that offers both high efficiency and high performance. Specifically, it can achieve the performance of a 3D CNN while maintaining the complexity of a 2D CNN. TSM shifts part of the channels along the temporal dimension, thus facilitating information exchange among neighboring frames. It can be inserted into…
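The channel-shifting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `temporal_shift`, the `fold_div` parameter, the zero padding at the clip boundaries, and the representation of a clip as a list of frames with one scalar per channel are all assumptions chosen for readability.

```python
def temporal_shift(frames, fold_div=4):
    """Shift a fraction of the channels along the time axis.

    frames: list of T frames, each a list of C channel values.
    fold_div: hypothetical knob; C // fold_div channels move in each direction.
    """
    num_frames = len(frames)
    num_channels = len(frames[0])
    fold = num_channels // fold_div

    # Start from zeros so positions with no temporal neighbor stay zero-padded.
    shifted = [[0] * num_channels for _ in range(num_frames)]

    for t in range(num_frames):
        for c in range(num_channels):
            if c < fold:
                src = t + 1  # first group: read from the next frame
            elif c < 2 * fold:
                src = t - 1  # second group: read from the previous frame
            else:
                src = t      # remaining channels are left in place
            if 0 <= src < num_frames:
                shifted[t][c] = frames[src][c]
    return shifted


# Toy clip: 3 frames, 4 channels; value 10 * t + c encodes (frame, channel).
frames = [[10 * t + c for c in range(4)] for t in range(3)]
shifted = temporal_shift(frames, fold_div=4)
for row in shifted:
    print(row)
```

After the shift, each frame holds some channels from its neighbors, so a per-frame 2D operation applied afterwards sees information from adjacent time steps. The shift itself is only data movement and performs no multiplications or additions.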
Citation impact
- FWCI: 92.37
- Percentile: 100%
- References: 103
Authors
- 3
Topics & keywords
- Computer science
- Computation
- Latency (audio)
- Low latency (capital markets)
- Video tracking
- Artificial intelligence
- Computer engineering
- Video processing