Article · Dec 1, 2015 · Closed access

Bilinear CNN Models for Fine-Grained Visual Recognition

University of Massachusetts Amherst

Indexed in Crossref

Abstract

We propose bilinear models, a recognition architecture that consists of two feature extractors whose outputs are multiplied using the outer product at each location of the image and pooled to obtain an image descriptor. This architecture can model local pairwise feature interactions in a translationally invariant manner, which is particularly useful for fine-grained categorization. It also generalizes various orderless texture descriptors such as the Fisher vector, VLAD and O2P. We present experiments with bilinear models where the feature extractors are based on convolutional neural networks. The bilinear form simplifies gradient computation and allows end-to-end training of both networks using image labels only.…
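
The pooling step described in the abstract can be illustrated with a minimal sketch. It assumes sum pooling over locations (the abstract says only "pooled") and uses random vectors as stand-ins for the outputs of the two CNN feature extractors; the function name `bilinear_pool` and all sizes are hypothetical and chosen only for illustration.

```python
import random


def bilinear_pool(feats_a, feats_b):
    """Sum the outer products of paired local features over all locations.

    feats_a: list of length-M vectors, one per image location.
    feats_b: list of length-N vectors at the same locations.
    Returns an M x N matrix (list of rows); flattening it gives the descriptor.
    Sum pooling is an assumption made for this sketch.
    """
    m, n = len(feats_a[0]), len(feats_b[0])
    pooled = [[0.0] * n for _ in range(m)]
    for a, b in zip(feats_a, feats_b):
        for i in range(m):
            for j in range(n):
                # Pairwise interaction between channel i of A and channel j of B
                pooled[i][j] += a[i] * b[j]
    return pooled


random.seed(0)
locations = 16  # hypothetical 4 x 4 grid of spatial locations
feats_a = [[random.gauss(0, 1) for _ in range(3)] for _ in range(locations)]
feats_b = [[random.gauss(0, 1) for _ in range(5)] for _ in range(locations)]
descriptor = bilinear_pool(feats_a, feats_b)

# Reorder the locations identically in both streams and pool again.
order = list(range(locations))
random.shuffle(order)
shuffled = bilinear_pool(
    [feats_a[k] for k in order],
    [feats_b[k] for k in order],
)
```

Because the outer products are summed, `descriptor` and `shuffled` agree up to floating-point rounding: the result does not depend on where in the image each local feature pair occurs, which is the orderless property the abstract refers to.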

Citation impact

Total citations: 2,059
FWCI: 60.45
Percentile: 100%
References: 59

Authors: 3

Topics & keywords

Keywords
  • Bilinear interpolation
  • Computer science
  • Convolutional neural network
  • Pattern recognition (psychology)
  • Pairwise comparison
  • Artificial intelligence
  • Feature (linguistics)
  • Computation