Two-Stream Convolutional Networks for Action Recognition in Videos

Simonyan, Karen; Zisserman, Andrew

Preprint; Oxford University Research Archive (ORA) (University of Oxford); Jun 9, 2014; Green OA

Abstract

We investigate architectures of discriminatively trained deep Convolutional Networks (ConvNets) for action recognition in video. The challenge is to capture the complementary information on appearance from still frames and motion between frames. We also aim to incorporate into the network design aspects of the best performing hand-crafted features. Our contribution is three-fold. First, we propose a two-stream ConvNet architecture which incorporates spatial and temporal networks. Second, we demonstrate that a ConvNet trained on multi-frame dense optical flow is able to achieve very good performance in spite of limited training data. Finally, we show that multi-task learning, applied to two different action…
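
As an illustration of the two-stream idea, the following is a minimal sketch and not the authors' implementation. It assumes that the temporal stream receives the horizontal and vertical optical-flow fields of L consecutive frame pairs stacked into 2L input channels, and that the class scores of the two streams are combined by simple averaging; the abstract excerpt above does not specify the fusion rule, so averaging is only one possible choice. The names `stack_flow`, `softmax`, and `fuse_by_averaging` are hypothetical, and the ConvNets themselves are omitted: only their output scores appear here.

```python
import math


def stack_flow(flows):
    """Stack per-frame-pair flow fields into one multi-channel input.

    `flows` is a list of (dx, dy) pairs, one per consecutive frame pair.
    The result has 2 * len(flows) channels in the (assumed) order
    dx_1, dy_1, ..., dx_L, dy_L.
    """
    channels = []
    for dx, dy in flows:
        channels.append(dx)
        channels.append(dy)
    return channels


def softmax(scores):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_by_averaging(spatial_scores, temporal_scores):
    """Late fusion (assumed): average the two streams' softmax outputs."""
    p_spatial = softmax(spatial_scores)
    p_temporal = softmax(temporal_scores)
    return [(a + b) / 2 for a, b in zip(p_spatial, p_temporal)]


# Toy usage: two frame pairs of 1x1 flow fields, two action classes.
flow_input = stack_flow([([[0.0]], [[1.0]]), ([[2.0]], [[3.0]])])
fused = fuse_by_averaging([2.0, 0.0], [0.0, 2.0])
```

In this toy case the two streams disagree symmetrically, so the fused distribution is uniform over the two classes; with real networks the averaged scores would reflect both appearance and motion evidence.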

Citation impact

Total citations: 1,475

References: 22

Authors: 2

Topics & keywords

Keywords

Computer science
Margin (machine learning)
Artificial intelligence
Optical flow
Action recognition
Frame (networking)
Deep learning
Motion (physics)

UN Sustainable Development Goals

Reduced inequalities

Funding

Nvidia