Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization
Georgia Institute of Technology · Meta (Israel)
Abstract
We propose a technique for producing 'visual explanations' for decisions from a large class of Convolutional Neural Network (CNN)-based models, making them more transparent. Our approach, Gradient-weighted Class Activation Mapping (Grad-CAM), uses the gradients of any target concept (say, logits for 'dog' or even a caption) flowing into the final convolutional layer to produce a coarse localization map highlighting the regions of the image that are important for predicting the concept. Unlike previous approaches, Grad-CAM is applicable to a wide variety of CNN model families: (1) CNNs with fully-connected layers (e.g. VGG), (2) CNNs used for structured outputs (e.g. captioning), (3) CNNs used in tasks with…
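The core computation described in the abstract can be sketched in a few lines. The abstract is truncated and does not give the formula, so this sketch assumes the commonly cited Grad-CAM formulation: each channel of the final convolutional layer is weighted by the spatial average of the target score's gradient with respect to that channel, the weighted channels are summed, and a ReLU keeps only positive evidence. The function name `grad_cam` and the toy inputs are illustrative, and the gradients are supplied by hand here rather than obtained by backpropagation through a real network.

```python
def grad_cam(activations, gradients):
    """Coarse localization map from final-conv-layer activations and gradients.

    Both arguments are nested lists shaped [K][H][W]: K feature maps of
    size H x W, and the gradient of the target score w.r.t. each of them.
    """
    K = len(activations)
    H = len(activations[0])
    W = len(activations[0][0])

    # Channel weights: global average pooling of the gradients (assumed formulation).
    weights = [sum(sum(row) for row in gradients[k]) / (H * W) for k in range(K)]

    # Weighted sum over channels, then ReLU to keep positive contributions only.
    return [
        [max(0.0, sum(weights[k] * activations[k][i][j] for k in range(K)))
         for j in range(W)]
        for i in range(H)
    ]


# Toy example: two 2x2 feature maps with hand-picked gradients.
acts = [
    [[1, 0], [0, 0]],   # channel 0 fires in the top-left corner
    [[0, 0], [0, 2]],   # channel 1 fires in the bottom-right corner
]
grads = [
    [[0.5, 0.5], [0.5, 0.5]],      # channel 0 supports the target concept
    [[-1.0, -1.0], [-1.0, -1.0]],  # channel 1 counts against it
]
cam = grad_cam(acts, grads)
```

In this toy case only the top-left location stays positive after the ReLU, because the channel active in the bottom-right has a negative weight. In practice the H x W map is upsampled to the input image size before being overlaid as a heatmap.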
Citation impact
- FWCI: 245.84
- Percentile: 100%
- References: 65
Authors: 6
Topics & keywords
- Closed captioning
- Computer science
- Discriminative model
- Convolutional neural network
- Artificial intelligence
- Visualization
- Generalization
- Question answering
- Reduced inequalities