Abstract
Applying convolutional neural networks to large images is computationally expensive because the amount of computation scales linearly with the number of image pixels. We present a novel recurrent neural network model that is capable of extracting information from an image or video by adaptively selecting a sequence of regions or locations and only processing the selected regions at high resolution. Like convolutional neural networks, the proposed model has a degree of translation invariance built in, but the amount of computation it performs can be controlled independently of the input image size. While the model is non-differentiable, it can be trained using reinforcement learning methods to learn…
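The claim that computation can be decoupled from image size can be illustrated with a small sketch. This is not the authors' model: it has no recurrent network and no learned location policy, and the helper names `extract_glimpse` and `pixels_processed`, the fixed glimpse centres, and the zero-padding at the borders are all assumptions made for illustration. It only shows that cropping a fixed-size region at a chosen location touches the same number of pixels whether the input is small or large.

```python
def extract_glimpse(image, center, size):
    """Return a size x size patch of `image` centred at `center` (row, col).

    `image` is a list of equal-length rows. Pixels that fall outside the
    image are zero-padded (an assumption of this sketch).
    """
    rows, cols = len(image), len(image[0])
    top = center[0] - size // 2
    left = center[1] - size // 2
    patch = []
    for r in range(top, top + size):
        row = []
        for c in range(left, left + size):
            inside = 0 <= r < rows and 0 <= c < cols
            row.append(image[r][c] if inside else 0)
        patch.append(row)
    return patch


def pixels_processed(num_glimpses, size):
    """Pixels read by `num_glimpses` glimpses; independent of image size."""
    return num_glimpses * size * size


# Two inputs of very different size (hypothetical constant-valued images).
small = [[1] * 28 for _ in range(28)]
large = [[1] * 1024 for _ in range(1024)]

# Glimpse centres are fixed here; in the described model they would be
# selected adaptively by the network.
g_small = extract_glimpse(small, (14, 14), 8)
g_large = extract_glimpse(large, (512, 512), 8)
```

Both glimpses are 8 x 8 regardless of the input, so a fixed budget of glimpses costs the same number of pixel reads on either image, whereas a full pass over the image would grow with its pixel count.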
Citation impact
- FWCI: —
- Percentile: —
- References: 22
Authors
- Volodymyr Mnih (corresponding author)
  Google (United States), Google DeepMind (United Kingdom)
- Nicolas Heess
  Google (United States), Google DeepMind (United Kingdom)
- Alex Graves
  Google (United States), Google DeepMind (United Kingdom)
- Koray Kavukcuoglu
  Google (United States), Google DeepMind (United Kingdom)
Topics & keywords
- Computer science
- Artificial intelligence
- Convolutional neural network
- Computation
- Differentiable function
- Pattern recognition (psychology)
- Task (project management)
- Image (mathematics)