Learning Spatiotemporal Features with 3D Convolutional Networks

Tran, Du; Bourdev, Lubomir; Fergus, Rob; Torresani, Lorenzo; Paluri, Manohar

doi:10.1109/iccv.2015.510

Preprint · Dec 1, 2015 · Closed access

Meta (Israel) · Dartmouth Hospital · +1 more institution

Indexed in Crossref

Abstract

We propose a simple yet effective approach for spatiotemporal feature learning using deep 3-dimensional convolutional networks (3D ConvNets) trained on a large-scale supervised video dataset. Our findings are three-fold: 1) 3D ConvNets are more suitable for spatiotemporal feature learning than 2D ConvNets; 2) a homogeneous architecture with small 3x3x3 convolution kernels in all layers is among the best-performing architectures for 3D ConvNets; and 3) our learned features, namely C3D (Convolutional 3D), with a simple linear classifier outperform state-of-the-art methods on 4 different benchmarks and are comparable with current best methods on the other 2 benchmarks. In addition, the features are…
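To make the 3x3x3 kernel in the abstract concrete, the following is a minimal sketch of a single 3D convolution over a toy video clip, written in plain Python. It is an illustration only, not the authors' implementation: the function name `conv3d_same`, the zero padding, the stride of 1, the single channel, and the averaging kernel weights are all assumptions chosen for brevity. The point it shows is that a 3D kernel slides over time as well as space, so the output keeps a temporal axis.

```python
def conv3d_same(volume, kernel):
    """Naive single-channel 3D convolution (cross-correlation form).

    Assumptions for this sketch: zero padding, stride 1, odd kernel sizes.
    volume: nested list indexed [t][y][x] (frames, height, width)
    kernel: nested list indexed [dt][dy][dx]
    Returns a nested list with the same shape as `volume`.
    """
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    pt, ph, pw = kt // 2, kh // 2, kw // 2  # padding that preserves shape
    out = [[[0.0] * W for _ in range(H)] for _ in range(T)]
    for t in range(T):
        for y in range(H):
            for x in range(W):
                acc = 0.0
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            tt, yy, xx = t + dt - pt, y + dy - ph, x + dx - pw
                            # Positions outside the clip count as zeros.
                            if 0 <= tt < T and 0 <= yy < H and 0 <= xx < W:
                                acc += volume[tt][yy][xx] * kernel[dt][dy][dx]
                out[t][y][x] = acc
    return out


# Hypothetical toy clip: 4 frames of 5x5 pixels, all ones.
clip = [[[1.0] * 5 for _ in range(5)] for _ in range(4)]
# 3x3x3 averaging kernel (illustrative weights, not learned ones).
kernel = [[[1.0 / 27] * 3 for _ in range(3)] for _ in range(3)]

features = conv3d_same(clip, kernel)
# The output is still a (frames, height, width) volume.
print(len(features), len(features[0]), len(features[0][0]))
```

In practice a framework layer with multiple input and output channels would replace these nested loops; the loops are kept here only to expose the indexing over the temporal dimension.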

Citation impact

Total citations: 9,650

FWCI: 265.76
Percentile: 100%
References: 67

Authors: 5

Topics & keywords

Keywords

Computer science
Artificial intelligence
Convolutional neural network
Classifier (UML)
Pattern recognition (psychology)
Convolution (computer science)
Inference
Deep learning
