Action recognition with trajectory-pooled deep-convolutional descriptors
Shenzhen Institutes of Advanced Technology · Chinese University of Hong Kong
Abstract
Visual features are of vital importance for human action understanding in videos. This paper presents a new video representation, called trajectory-pooled deep-convolutional descriptor (TDD), which shares the merits of both hand-crafted features [31] and deep-learned features [24]. Specifically, we utilize deep architectures to learn discriminative convolutional feature maps, and conduct trajectory-constrained pooling to aggregate these convolutional features into effective descriptors. To enhance the robustness of TDDs, we design two normalization methods to transform convolutional feature maps, namely spatiotemporal normalization and channel normalization. The advantages of our features come from (i) TDDs…
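The two normalizations and the trajectory-constrained pooling named in the abstract might be sketched as below. This is a minimal NumPy illustration, not the authors' implementation. It assumes that spatiotemporal normalization divides each channel by its maximum over the whole video volume, that channel normalization divides each position by its maximum across channels, and that pooling sums feature-map values at the trajectory points. The function names, the (T, H, W, C) array layout, the `scale` ratio, and `eps` are hypothetical choices made for the example.

```python
import numpy as np

def spatiotemporal_normalize(fmap, eps=1e-8):
    """Divide each channel by its maximum over all (t, y, x) positions.

    fmap: non-negative array of shape (T, H, W, C). Layout is an assumption.
    """
    return fmap / (fmap.max(axis=(0, 1, 2), keepdims=True) + eps)

def channel_normalize(fmap, eps=1e-8):
    """Divide each (t, y, x) position by its maximum across channels."""
    return fmap / (fmap.max(axis=3, keepdims=True) + eps)

def trajectory_pool(fmap, trajectory, scale=1.0):
    """Sum-pool feature vectors along one trajectory.

    trajectory: iterable of (t, x, y) points in video-frame coordinates.
    scale: hypothetical ratio mapping frame coordinates to feature-map
           coordinates (feature maps are smaller than the input frames).
    """
    _, height, width, channels = fmap.shape
    descriptor = np.zeros(channels, dtype=np.float64)
    for t, x, y in trajectory:
        # Map the point onto the feature map and clamp to valid indices.
        row = min(max(int(round(y * scale)), 0), height - 1)
        col = min(max(int(round(x * scale)), 0), width - 1)
        descriptor += fmap[int(t), row, col]
    return descriptor
```

In this sketch, one descriptor per trajectory would be obtained by normalizing a feature map with either function and then calling `trajectory_pool` on the result; how the resulting descriptors are combined across layers or streams is not covered by the abstract and is left out here.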
Citation impact
- FWCI: 103.69
- Percentile: 100%
- References: 63
Authors: 3
Topics & keywords
- Discriminative model
- Artificial intelligence
- Computer science
- Pooling
- Normalization (sociology)
- Convolutional neural network
- Pattern recognition (psychology)
- Deep learning
- Reduced inequalities