Article · Jul 1, 2017 · Closed access

Spatiotemporal Multiplier Networks for Video Action Recognition

Graz University of Technology · Austrian Academy of Sciences · +1 more institution

Indexed in Crossref

Abstract

This paper presents a general ConvNet architecture for video action recognition based on multiplicative interactions of spacetime features. Our model combines the appearance and motion pathways of a two-stream architecture by motion gating and is trained end-to-end. We theoretically motivate multiplicative gating functions for residual networks and empirically study their effect on classification accuracy. To capture long-term dependencies we inject identity mapping kernels for learning temporal relationships. Our architecture is fully convolutional in spacetime and able to evaluate a video in a single forward pass. Empirical investigation reveals that our model produces state-of-the-art results on two…
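
The two mechanisms named in the abstract, multiplicative motion gating of a residual unit and temporal kernels initialised as identity mappings, can be illustrated with a small NumPy sketch. This is a toy illustration under stated assumptions, not the authors' implementation: the function names, the `(T, H, W, C)` tensor layout, the per-channel linear map standing in for the convolutional residual branch, and the choice to gate the residual branch input rather than the skip path are all hypothetical.

```python
import numpy as np

def residual_branch(x, weight):
    # Stand-in for the convolutional residual function F (assumption):
    # a per-channel linear map followed by ReLU.
    # x: (T, H, W, C), weight: (C, C)
    return np.maximum(x @ weight, 0.0)

def gated_residual_unit(appearance, motion, weight):
    # Multiplicative interaction: motion features gate the appearance
    # features elementwise before they enter the residual branch.
    gated = appearance * motion
    # The identity skip path is left ungated in this sketch.
    return appearance + residual_branch(gated, weight)

def identity_temporal_kernel(channels, taps=3):
    # Temporal kernel of shape (taps, C_in, C_out), with taps assumed odd.
    # Only the centre tap is non-zero and equals the identity matrix, so
    # the convolution initially passes its input through unchanged.
    kernel = np.zeros((taps, channels, channels))
    kernel[taps // 2] = np.eye(channels)
    return kernel

def temporal_conv(x, kernel):
    # 1D convolution along the time axis with zero padding ("same" length).
    taps = kernel.shape[0]
    pad = taps // 2
    padded = np.pad(x, ((pad, pad), (0, 0), (0, 0), (0, 0)))
    out = np.zeros_like(x)
    for k in range(taps):
        out += padded[k:k + x.shape[0]] @ kernel[k]
    return out

rng = np.random.default_rng(0)
appearance = rng.standard_normal((4, 2, 2, 3))  # (T, H, W, C)
motion = rng.standard_normal((4, 2, 2, 3))
weight = rng.standard_normal((3, 3))

fused = gated_residual_unit(appearance, motion, weight)
smoothed = temporal_conv(fused, identity_temporal_kernel(channels=3))
```

In this sketch, a motion input of all zeros closes the gate, so the unit reduces to the identity skip path, and the identity-initialised temporal kernel leaves its input unchanged until training moves the weights away from that starting point.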

Citation impact

  • Total citations: 699
  • FWCI: 31.97
  • Percentile: 100%
  • References: 62

Authors

3

Topics & keywords

Keywords
  • Action recognition
  • Computer science
  • Artificial intelligence
  • Multiplicative function
  • Residual
  • Pattern recognition (psychology)
  • Architecture
  • Spacetime
UN Sustainable Development Goals
  • Sustainable cities and communities
