Preprint · arXiv (Cornell University) · Jun 9, 2014 · Green OA

Two-Stream Convolutional Networks for Action Recognition in Videos

University of Oxford

Indexed in: arXiv, DataCite

Abstract

We investigate architectures of discriminatively trained deep Convolutional Networks (ConvNets) for action recognition in video. The challenge is to capture the complementary information on appearance from still frames and motion between frames. We also aim to generalise the best performing hand-crafted features within a data-driven learning framework. Our contribution is three-fold. First, we propose a two-stream ConvNet architecture which incorporates spatial and temporal networks. Second, we demonstrate that a ConvNet trained on multi-frame dense optical flow is able to achieve very good performance in spite of limited training data. Finally, we show that multi-task learning, applied to two different action…
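The two ideas named in the abstract, a temporal network fed with multi-frame dense optical flow and a separate spatial network, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the truncated abstract: the helper names `stack_flow`, `softmax`, and `fuse_by_averaging` are hypothetical, flow fields are plain nested lists, the per-class scores are made-up numbers, and combining the streams by averaging their softmax outputs is only one possible fusion rule.

```python
import math


def stack_flow(flow_pairs):
    """Stack per-frame-pair optical flow into one multi-channel input.

    flow_pairs: list of (dx, dy) tuples, one per consecutive frame pair,
    where dx and dy are H x W nested lists of horizontal and vertical
    displacement. Returns 2 * L channels ordered dx_1, dy_1, ..., dx_L, dy_L.
    """
    channels = []
    for dx, dy in flow_pairs:
        channels.append(dx)
        channels.append(dy)
    return channels


def softmax(scores):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_by_averaging(spatial_scores, temporal_scores):
    """Late fusion (assumed rule): average the two streams' class posteriors."""
    p_spatial = softmax(spatial_scores)
    p_temporal = softmax(temporal_scores)
    return [(a + b) / 2 for a, b in zip(p_spatial, p_temporal)]


# Toy usage: two frame pairs of 1 x 1 flow, three action classes.
flows = [([[0.0]], [[1.0]]), ([[2.0]], [[3.0]])]
temporal_input = stack_flow(flows)  # 4 channels for L = 2

# Hypothetical raw scores standing in for each network's output.
fused = fuse_by_averaging([2.0, 1.0, 0.1], [0.5, 2.5, 0.3])
predicted_class = max(range(len(fused)), key=lambda i: fused[i])
```

In this sketch the two streams never share weights or features; each produces its own class scores, and they are combined only at the very end.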

Citation impact

Total citations: 5,366
References: 30

Authors

2

Topics & keywords

Keywords
  • Action recognition
  • Computer science
  • Action (physics)
  • Convolutional neural network
  • Artificial intelligence
  • Pattern recognition (psychology)
  • Physics
UN Sustainable Development Goals
  • Reduced inequalities