Learning mid-level features for recognition
École Normale Supérieure - PSL · New York University · +3 more institutions
Abstract
Many successful models for scene or object recognition transform low-level descriptors (such as Gabor filter responses, or SIFT descriptors) into richer representations of intermediate complexity. This process can often be broken down into two steps: (1) a coding step, which performs a pointwise transformation of the descriptors into a representation better adapted to the task, and (2) a pooling step, which summarizes the coded features over larger neighborhoods. Several combinations of coding and pooling schemes have been proposed in the literature. The goal of this paper is threefold. We seek to establish the relative importance of each step of mid-level feature extraction through a comprehensive cross…
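The two-step decomposition described in the abstract can be illustrated with a minimal sketch. The abstract as shown here does not name specific schemes, so this example assumes hard vector quantization for the coding step and average or max pooling for the pooling step, as common representatives. The function names `hard_assign` and `pool`, the toy codebook, and the descriptors are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def hard_assign(descriptor: Sequence[float],
                codebook: Sequence[Sequence[float]]) -> List[float]:
    """Coding step (assumed scheme: hard vector quantization).

    Maps one low-level descriptor to a one-hot code that marks the
    nearest codeword under squared Euclidean distance.
    """
    dists = [sum((d - c) ** 2 for d, c in zip(descriptor, word))
             for word in codebook]
    nearest = dists.index(min(dists))
    return [1.0 if k == nearest else 0.0 for k in range(len(codebook))]


def pool(codes: Sequence[Sequence[float]], mode: str = "max") -> List[float]:
    """Pooling step: summarize the codes of one neighborhood, per codeword."""
    columns = list(zip(*codes))  # one column per codeword
    if mode == "max":
        return [max(col) for col in columns]
    if mode == "average":
        return [sum(col) / len(col) for col in columns]
    raise ValueError(f"unknown pooling mode: {mode!r}")


# Toy data (hypothetical): three 2-D codewords, four descriptors
# taken from a single spatial neighborhood.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
descriptors = [[0.1, 0.0], [0.9, 1.1], [0.2, 0.1], [0.8, 0.9]]

codes = [hard_assign(d, codebook) for d in descriptors]
avg_feature = pool(codes, "average")  # fraction of descriptors per codeword
max_feature = pool(codes, "max")      # whether each codeword occurs at all
```

With hard assignment, average pooling yields a normalized histogram of codeword occurrences over the neighborhood, while max pooling only records whether each codeword is present.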
Citation impact
- FWCI: 93.10
- Percentile: 100%
- References: 42
Authors (4)
- Y-Lan Boureau (corresponding author)
École Normale Supérieure - PSL, New York University, Centre National de la Recherche Scientifique, Courant Institute of Mathematical Sciences, Institut national de recherche en informatique et en automatique
- Francis Bach
Centre National de la Recherche Scientifique, École Normale Supérieure - PSL, Institut national de recherche en informatique et en automatique
- Yann LeCun
Courant Institute of Mathematical Sciences, New York University
- Jean Ponce
Topics & keywords
- Pooling
- Discriminative model
- Computer science
- Artificial intelligence
- Pattern recognition (psychology)
- Neural coding
- Coding (social sciences)
- Scale-invariant feature transform
- Reduced inequalities