Article · Dec 1, 2015 · Closed access

Human Action Recognition Using Factorized Spatio-Temporal Convolutional Networks

Hong Kong University of Science and Technology · Lenovo (China) · +1 more institution

Indexed in Crossref

Abstract

Human actions in video sequences are three-dimensional (3D) spatio-temporal signals characterizing both the visual appearance and motion dynamics of the involved humans and objects. Inspired by the success of convolutional neural networks (CNNs) for image classification, recent attempts have been made to learn 3D CNNs for recognizing human actions in videos. However, partly due to the high complexity of training 3D convolution kernels and the need for large quantities of training videos, only limited success has been reported. This has motivated us to investigate in this paper a new deep architecture that can handle 3D signals more effectively. Specifically, we propose factorized spatio-temporal convolutional…

Citation impact

Total citations: 545
FWCI: 29.95
Percentile: 100%
References: 50

Authors

4

Topics & keywords

Keywords
  • Action recognition
  • Computer science
  • Convolutional neural network
  • Action (physics)
  • Artificial intelligence
  • Pattern recognition (psychology)
  • Physics

Funding