GradCAM is a technique for explaining the output of a vision system: it uses the gradients in the Convolutional Neural Network layers to produce a localization of the output, that is, a heatmap showing which parts of the image contributed most to the output.

The key idea is that for some target class $c$, we can compute the gradient of the score $y^c$ (pre-softmax) with respect to a layer's feature map activations $A^k$. We can then weight the feature maps by this gradient to get the localization map. Formally, we compute importance weights via global average pooling,

$$
\alpha_k^c = \frac{1}{Z} \sum_i \sum_j \frac{\partial y^c}{\partial A^k_{ij}},
$$

where $Z$ is the number of spatial locations in the feature map, and compute the localization for a class as

$$
L^c = \text{ReLU}\left( \sum_k \alpha_k^c A^k \right).
$$

Intuitively, $\alpha_k^c$ is high when $A^k$ strongly influences the output; that is, if we change $A^k$, $y^c$ will change a lot. Therefore, multiplying the averaged gradient with these features gives us a signal for how the activations affect our output: positive for parts that contribute to the target and negative otherwise. After the weighted sum, we take the ReLU to filter out any negative influences, reducing noise and leaving only the positive signals from the activations.
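The two steps above (pool the gradients into weights, then take a ReLU of the weighted sum of feature maps) can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `grad_cam` and the toy model are assumptions. To avoid needing an autodiff library, the toy model is assumed to be global average pooling followed by a linear layer, so the gradient of the class score with respect to every location of feature map $k$ is simply the head weight for that channel divided by the number of spatial locations.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Localization map from one layer's activations and class-score gradients.

    activations: array of shape (K, H, W), the feature maps A^k.
    gradients:   array of shape (K, H, W), d y^c / d A^k_ij for target class c.
    """
    # Importance weights: global-average-pool the gradients over space.
    alpha = gradients.mean(axis=(1, 2))                   # shape (K,)
    # Weighted sum of the feature maps over channels, then ReLU.
    weighted = np.tensordot(alpha, activations, axes=1)   # shape (H, W)
    return np.maximum(weighted, 0.0)

# Toy data (hypothetical): K = 2 feature maps of size 2x2.
activations = np.array([[[1.0, 0.0], [0.0, 2.0]],
                        [[0.0, 3.0], [1.0, 0.0]]])

# Assumed toy head: y^c = sum_k head_weights[k] * mean(A^k),
# so d y^c / d A^k_ij = head_weights[k] / (H * W) at every location.
head_weights = np.array([2.0, -4.0])
num_locations = activations.shape[1] * activations.shape[2]
gradients = np.ones_like(activations) * (head_weights / num_locations)[:, None, None]

cam = grad_cam(activations, gradients)
print(cam)
```

In this toy case channel 0 supports the class and channel 1 opposes it, so the map is nonzero only where channel 0's weighted activation outweighs channel 1's. With a real network, the activations and gradients would instead come from a forward and backward pass through the chosen convolutional layer.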

Applications

GradCAM is a versatile technique with applications in both prediction and interpretation.

  1. Weighting guided backpropagation with GradCAM weights gives us more localized pixel-wise explanations (Guided GradCAM).
  2. Taking the negative of the gradients for the weights gives us counterfactual localization: regions which, if removed, would increase confidence for our target.
  3. Combining GradCAM output with weakly-supervised methods can give localization bounding boxes or segmentation masks.
  4. Visualized localizations can be used to increase trust, debug failure modes, and find biases.
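As an illustration of the counterfactual variant in item 2, the sketch below negates the gradients before pooling them into weights. The function name `counterfactual_cam` and the toy inputs are hypothetical, and the gradients are written out by hand rather than computed from a real network.

```python
import numpy as np

def counterfactual_cam(activations, gradients):
    """Counterfactual map: same computation, but with negated gradients.

    activations: array of shape (K, H, W), the feature maps.
    gradients:   array of shape (K, H, W), gradients of the target class score.
    """
    # Negate the gradients, then global-average-pool them into weights.
    alpha = (-gradients).mean(axis=(1, 2))                # shape (K,)
    # Weighted sum over channels followed by ReLU.
    weighted = np.tensordot(alpha, activations, axes=1)   # shape (H, W)
    return np.maximum(weighted, 0.0)

# Toy data (hypothetical): channel 0 supports the target, channel 1 opposes it.
activations = np.array([[[1.0, 0.0], [0.0, 2.0]],
                        [[0.0, 3.0], [1.0, 0.0]]])
gradients = np.stack([np.full((2, 2), 0.5), np.full((2, 2), -1.0)])

cf_map = counterfactual_cam(activations, gradients)
print(cf_map)
```

Here the map is nonzero where the opposing channel's weighted activation outweighs the supporting channel's, which are the regions whose removal would be expected to raise the target score.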