Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

He, Kaiming; Zhang, Xiangyu; Ren, Shaoqing; Sun, Jian

doi:10.1109/tpami.2015.2389824

Article · IEEE Transactions on Pattern Analysis and Machine Intelligence · Jan 9, 2015 · Closed access

Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun

Microsoft Research Asia (China) · Microsoft (United States) · +2 more institutions

Indexed in: Crossref, PubMed

Abstract

Existing deep convolutional neural networks (CNNs) require a fixed-size (e.g., 224 × 224) input image. This requirement is "artificial" and may reduce the recognition accuracy for the images or sub-images of an arbitrary size/scale. In this work, we equip the networks with another pooling strategy, "spatial pyramid pooling", to eliminate the above requirement. The new network structure, called SPP-net, can generate a fixed-length representation regardless of image size/scale. Pyramid pooling is also robust to object deformations. With these advantages, SPP-net should in general improve all CNN-based image classification methods. On the ImageNet 2012 dataset, we demonstrate that SPP-net boosts the accuracy of a…
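The fixed-length property described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes max pooling within each bin, an illustrative pyramid of 1 × 1, 2 × 2, and 4 × 4 grids, and floor/ceil bin boundaries; the names `spatial_pyramid_pool` and `make_map` are hypothetical. It shows only that feature maps of different spatial sizes yield output vectors of the same length.

```python
import math


def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool each channel over an n x n grid for every n in `levels`.

    feature_map: list of channels, each a non-empty H x W list of lists.
    Returns a flat list of length channels * sum(n * n for n in levels).
    The pyramid levels and the use of max pooling are assumptions here.
    """
    pooled = []
    for channel in feature_map:
        height, width = len(channel), len(channel[0])
        for n in levels:
            for i in range(n):
                # Row span of bin i; floor/ceil keeps every bin non-empty.
                r0 = (i * height) // n
                r1 = math.ceil((i + 1) * height / n)
                for j in range(n):
                    # Column span of bin j, computed the same way.
                    c0 = (j * width) // n
                    c1 = math.ceil((j + 1) * width / n)
                    pooled.append(
                        max(
                            channel[r][c]
                            for r in range(r0, r1)
                            for c in range(c0, c1)
                        )
                    )
    return pooled


def make_map(channels, height, width):
    """Hypothetical helper: build a deterministic toy feature map."""
    return [
        [[(k + 1) * (r * width + c) for c in range(width)] for r in range(height)]
        for k in range(channels)
    ]


# Two inputs with different spatial sizes but the same channel count.
small = spatial_pyramid_pool(make_map(3, 6, 9))
large = spatial_pyramid_pool(make_map(3, 13, 10))
print(len(small), len(large))  # prints "63 63": 3 channels * (1 + 4 + 16) bins
```

Because the number of bins depends only on the pyramid levels and the channel count, a fully connected layer placed after such a pooling step would see the same input dimension whatever the image size.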

Citation impact

Total citations: 11,356

FWCI: 214.11
Percentile: 100%
References: 70

Authors: 4

Topics & keywords

Keywords

Pooling
Pascal (unit)
Artificial intelligence
Computer science
Convolutional neural network
Pattern recognition (psychology)
Pyramid (geometry)
Contextual image classification
