Temporal Action Localization in Untrimmed Videos via Multi-stage CNNs

Shou, Zheng; Wang, Dongang; Chang, Shih‐Fu

doi:10.1109/cvpr.2016.119

Preprint, Jun 1, 2016, closed access

Zheng Shou, Dongang Wang, Shih‐Fu Chang

Columbia University

Indexed in Crossref

Abstract

We address temporal action localization in untrimmed long videos. This is important because videos in real applications are usually unconstrained and contain multiple action instances plus video content of background scenes or other activities. To address this challenging issue, we exploit the effectiveness of deep networks in temporal action localization via three segment-based 3D ConvNets: (1) a proposal network identifies candidate segments in a long video that may contain actions, (2) a classification network learns a one-vs-all action classification model to serve as initialization for the localization network, and (3) a localization network fine-tunes the learned classification network to localize each…
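The three-stage design described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the sliding-window scheme, the window lengths, the 0.5 threshold, and the names `sliding_segments`, `localize`, `proposal_score`, and `localization_scores` are all assumptions introduced here for illustration, and the two scoring callables stand in for trained 3D ConvNets. Since the abstract says the classification network only initializes the localization network, the sketch does not call it at inference time.

```python
from typing import Callable, List, Sequence, Tuple

Segment = Tuple[int, int]  # (start_frame, end_frame), end exclusive


def sliding_segments(num_frames: int, lengths: Sequence[int],
                     overlap: float = 0.5) -> List[Segment]:
    """Enumerate fixed-length windows over a video.

    The window lengths and the overlap ratio are illustrative assumptions.
    """
    segments: List[Segment] = []
    for length in lengths:
        stride = max(1, int(length * (1.0 - overlap)))
        for start in range(0, num_frames - length + 1, stride):
            segments.append((start, start + length))
    return segments


def localize(num_frames: int,
             lengths: Sequence[int],
             proposal_score: Callable[[Segment], float],
             localization_scores: Callable[[Segment], Sequence[float]],
             proposal_threshold: float = 0.5) -> List[Tuple[Segment, int, float]]:
    """Keep segments the proposal stage accepts, then label each one.

    proposal_score and localization_scores are hypothetical stand-ins
    for the proposal and localization networks.
    """
    results = []
    for seg in sliding_segments(num_frames, lengths):
        # Stage 1: discard segments unlikely to contain any action.
        if proposal_score(seg) < proposal_threshold:
            continue
        # Stage 3: per-class scores from the (fine-tuned) localization model.
        scores = localization_scores(seg)
        label = max(range(len(scores)), key=lambda i: scores[i])
        results.append((seg, label, scores[label]))
    return results
```

With stub scorers, for example `localize(64, [16, 32], lambda s: 1.0 if s[0] >= 32 else 0.0, lambda s: [0.1, 0.7, 0.2])`, only windows starting at frame 32 or later survive the proposal stage, and each is assigned the class with the highest localization score.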

Citation impact

Total citations: 953

FWCI: 46.97
Percentile: 100%
References: 56

Authors: 3

Topics & keywords

Keywords

Initialization
Computer science
Artificial intelligence
Exploit
Pattern recognition (psychology)
Action (physics)