Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

Microsoft Research Asia (China) · Microsoft (United States) · +2 more institutions

Indexed in: Crossref, PubMed

Abstract

Existing deep convolutional neural networks (CNNs) require a fixed-size (e.g., 224 × 224) input image. This requirement is "artificial" and may reduce the recognition accuracy for the images or sub-images of an arbitrary size/scale. In this work, we equip the networks with another pooling strategy, "spatial pyramid pooling", to eliminate the above requirement. The new network structure, called SPP-net, can generate a fixed-length representation regardless of image size/scale. Pyramid pooling is also robust to object deformations. With these advantages, SPP-net should in general improve all CNN-based image classification methods. On the ImageNet 2012 dataset, we demonstrate that SPP-net boosts the accuracy of a…
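The fixed-length property described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `spatial_pyramid_pool`, the pyramid levels `(1, 2, 4)`, and the floor/ceil rule for bin boundaries are all assumptions chosen for illustration. The feature map of shape `(C, H, W)` is divided into n × n bins at each level, each bin is max-pooled per channel, and the results are concatenated.

```python
import numpy as np


def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a (C, H, W) feature map into a vector of length C * sum(n*n).

    `levels` and the bin-boundary rule are illustrative assumptions,
    not values taken from the paper.
    """
    channels, height, width = feature_map.shape
    pooled = []
    for n in levels:
        for i in range(n):
            # Row range of bin i: floor for the start, ceil for the end,
            # so bins cover the whole map and are never empty.
            r0 = (i * height) // n
            r1 = ((i + 1) * height + n - 1) // n
            for j in range(n):
                c0 = (j * width) // n
                c1 = ((j + 1) * width + n - 1) // n
                # One max value per channel for this spatial bin.
                pooled.append(feature_map[:, r0:r1, c0:c1].max(axis=(1, 2)))
    return np.concatenate(pooled)


# Two feature maps with different spatial sizes yield vectors of equal length.
small = spatial_pyramid_pool(np.random.rand(8, 6, 9))
large = spatial_pyramid_pool(np.random.rand(8, 13, 17))
print(small.shape, large.shape)
```

With 8 channels and levels (1, 2, 4), each vector has 8 × (1 + 4 + 16) = 168 entries regardless of the input height and width, which is what lets a fixed-size fully connected layer follow the pooling stage.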

Citation impact

Total citations: 11,356
FWCI: 214.11
Percentile: 100%
References: 70

Authors

4 authors

Topics & keywords

Keywords
  • Pooling
  • Pascal (unit)
  • Artificial intelligence
  • Computer science
  • Convolutional neural network
  • Pattern recognition (psychology)
  • Pyramid (geometry)
  • Contextual image classification