Learning Spatiotemporal Features with 3D Convolutional Networks
Meta (Israel) · Dartmouth Hospital · +1 more institution
Abstract
We propose a simple, yet effective approach for spatiotemporal feature learning using deep 3-dimensional convolutional networks (3D ConvNets) trained on a large scale supervised video dataset. Our findings are three-fold: 1) 3D ConvNets are more suitable for spatiotemporal feature learning compared to 2D ConvNets, 2) A homogeneous architecture with small 3x3x3 convolution kernels in all layers is among the best performing architectures for 3D ConvNets, and 3) Our learned features, namely C3D (Convolutional 3D), with a simple linear classifier outperform state-of-the-art methods on 4 different benchmarks and are comparable with current best methods on the other 2 benchmarks. In addition, the features are…
Citation impact
- FWCI
- 265.76
- Percentile
- 100%
- References
- 67
Authors
5Topics & keywords
- Computer science
- Artificial intelligence
- Convolutional neural network
- Classifier (UML)
- Pattern recognition (psychology)
- Convolution (computer science)
- Inference
- Deep learning