Grad-CAM++: Generalized Gradient-based Visual Explanations for Deep Convolutional Networks

Aditya Chattopadhyay* Anirban Sarkar* Prantik Howlader* Vineeth N Balasubramanian

* denotes equal contribution

In WACV 2018

Abstract

Over the last decade, Convolutional Neural Network (CNN) models have been highly successful in solving complex vision-based problems. However, deep models are perceived as "black box" methods because their internal functioning is poorly understood. There has been significant recent interest in developing explainable deep learning models, and this paper is an effort in this direction. Building on a recently proposed method called Grad-CAM, we propose Grad-CAM++ to provide better visual explanations of CNN model predictions than Grad-CAM, both in localizing objects and in explaining multiple occurrences of objects of a class in a single image. We provide a mathematical explanation for the proposed method, Grad-CAM++, which uses a weighted combination of the positive partial derivatives of the last convolutional layer feature maps with respect to a specific class score as weights to generate a visual explanation for the class label under consideration. Our extensive experiments and evaluations, both subjective and objective, on standard datasets show that Grad-CAM++ indeed provides better visual explanations for a given CNN architecture when compared to Grad-CAM.
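The weighting described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the last-layer feature maps and the gradients of the class score with respect to them are already available as arrays of shape (K, H, W), and that the per-pixel coefficients (called alpha here) are supplied by the caller; how those coefficients are derived is covered in the paper and is not reproduced. All function names are hypothetical.

```python
import numpy as np

def grad_cam_weights(grads):
    """Grad-CAM: one weight per feature map, the spatial mean of its gradients."""
    return grads.mean(axis=(1, 2))

def grad_cam_pp_weights(grads, alpha):
    """Grad-CAM++-style weights: a per-pixel weighted sum of the positive gradients.

    `alpha` holds the per-pixel weighting coefficients, shape (K, H, W).
    It is an input here (an assumption of this sketch), not computed.
    """
    positive_grads = np.maximum(grads, 0.0)  # keep only positive partial derivatives
    return (alpha * positive_grads).sum(axis=(1, 2))

def saliency_map(feature_maps, weights):
    """Combine feature maps with their weights, then keep only positive evidence."""
    combined = np.tensordot(weights, feature_maps, axes=1)  # shape (H, W)
    return np.maximum(combined, 0.0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feature_maps = rng.random((4, 3, 3))   # K=4 feature maps of size 3x3
    grads = rng.standard_normal((4, 3, 3)) # stand-in for d(class score)/d(feature map)
    alpha = np.full(grads.shape, 1.0 / 9)  # uniform coefficients, for illustration only
    weights = grad_cam_pp_weights(grads, alpha)
    print(saliency_map(feature_maps, weights).shape)
```

With uniform coefficients and all-positive gradients, the weights in this sketch coincide with the Grad-CAM weights; non-uniform coefficients let individual pixels contribute unequally to each feature map's weight.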

Paper

arXiv:1710.11063, 2017.

Object Localization

Visual examples comparing the object localization capabilities of Grad-CAM and Grad-CAM++. The results are for E^c(δ = 0.25). The green boxes mark the ground-truth annotations for the images.
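As a rough illustration of how a thresholded explanation map can be compared with a ground-truth box, the sketch below assumes that E^c(δ) keeps only the pixels whose min-max normalized intensity exceeds δ; the paper gives the exact definition. The explanation map is taken to be a 2-D intensity array, the box format (top, left, bottom, right, half-open) is an assumption, and both helper names are hypothetical.

```python
import numpy as np

def threshold_map(explanation, delta=0.25):
    """Boolean mask of pixels whose min-max normalized intensity exceeds delta."""
    lo, hi = explanation.min(), explanation.max()
    if hi == lo:
        # A constant map carries no localization information.
        return np.zeros(explanation.shape, dtype=bool)
    normalized = (explanation - lo) / (hi - lo)
    return normalized > delta

def fraction_inside_box(mask, box):
    """Fraction of the kept pixels that fall inside the ground-truth box.

    `box` is (top, left, bottom, right) with half-open ranges (assumed format).
    """
    top, left, bottom, right = box
    total = mask.sum()
    if total == 0:
        return 0.0
    return float(mask[top:bottom, left:right].sum() / total)

if __name__ == "__main__":
    explanation = np.array([[0.0, 0.1, 0.2],
                            [0.3, 1.0, 0.5],
                            [0.0, 0.2, 0.26]])
    mask = threshold_map(explanation, delta=0.25)
    print(mask.sum(), fraction_inside_box(mask, (1, 0, 2, 3)))
```

A higher fraction means more of the highlighted region lies inside the annotated object; this is only one simple way to summarize the overlap and is not the localization metric reported in the paper.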

Generating Explanation Maps for Different Architectures

Explanation maps E^c for images generated by Grad-CAM and Grad-CAM++. These explanations are for the AlexNet architecture.

Explanation maps E^c for images generated by Grad-CAM and Grad-CAM++. These explanations are for the ResNet architecture.

Citations

Chattopadhyay, A., Sarkar, A., Howlader, P. and Balasubramanian, V.N., 2017. Grad-CAM++: Generalized Gradient-based Visual Explanations for Deep Convolutional Networks. arXiv preprint arXiv:1710.11063.

BibTeX:

@article{chattopadhyay2017grad,
  title={Grad-CAM++: Generalized Gradient-based Visual Explanations for Deep Convolutional Networks},
  author={Chattopadhyay, Aditya and Sarkar, Anirban and Howlader, Prantik and Balasubramanian, Vineeth N},
  journal={arXiv preprint arXiv:1710.11063},
  year={2017}
}