Two-Stream Convolutional Networks for Action Recognition in Videos
Abstract
We investigate architectures of discriminatively trained deep Convolutional Networks (ConvNets) for action recognition in video. The challenge is to capture the complementary information on appearance from still frames and motion between frames. We also aim to incorporate into the network design aspects of the best performing hand-crafted features. Our contribution is three-fold. First, we propose a two-stream ConvNet architecture which incorporates spatial and temporal networks. Second, we demonstrate that a ConvNet trained on multi-frame dense optical flow is able to achieve very good performance in spite of limited training data. Finally, we show that multi-task learning, applied to two different action…
Citation impact
- Total citations: 1,475
- References: 22
Topics & keywords
Keywords
- Computer science
- Margin (machine learning)
- Artificial intelligence
- Optical flow
- Action recognition
- Frame (networking)
- Deep learning
- Motion (physics)
UN Sustainable Development Goals
- Reduced inequalities