WWW: A Unified Framework for Explaining What, Where and Why of Neural Networks by Interpretation of Neuron Concepts

Yong Hyun Ahn, Hyeon Bae Kim, Seong Tae Kim

Published: 20 Jun 2024, Last Modified: 29 Sept 2024 | CVPR 2024 | Everyone | CC BY 4.0

Abstract: Recent advancements in neural networks have showcased their remarkable capabilities across various domains. Despite these successes, the “black box” problem remains. To address this, we propose a novel framework, WWW, that offers the ‘what’, ‘where’, and ‘why’ of neural network decisions in human-understandable terms. Specifically, WWW uses adaptive selection for concept discovery, employing adaptive cosine similarity and thresholding techniques to effectively explain ‘what’. To address the ‘where’ and ‘why’, we propose a novel combination of neuron activation maps (NAMs) with Shapley values, generating localized concept maps and heatmaps for individual inputs. Furthermore, WWW introduces a method for predicting uncertainty that leverages heatmap similarities to estimate a prediction’s reliability. In experiments, WWW outperforms existing interpretability methods in both quantitative and qualitative evaluations. WWW provides a unified solution for explaining ‘what’, ‘where’, and ‘why’, introducing a method for deriving localized explanations from global interpretations and offering a plug-and-play solution adaptable to various architectures. Code is available at: https://github.com/ailab-kyunghee/WWW
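As a rough illustration of two ideas the abstract mentions, the sketch below shows (1) picking concepts for a neuron by cosine similarity with a threshold that adapts to the best score, and (2) scoring the agreement of two heatmaps by cosine similarity as a reliability signal. This is not the authors' implementation (see the linked repository for that). The function names, the `margin` parameter, and the use of plain NumPy vectors as embeddings and heatmaps are all assumptions made for illustration.

```python
import numpy as np


def cosine_similarity(a, b, eps=1e-12):
    """Cosine similarity between two arrays, flattened to vectors."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


def select_concepts(neuron_embedding, concept_embeddings, concept_names, margin=0.05):
    """Keep every concept whose similarity is within `margin` of the best one.

    `margin` is a hypothetical hyperparameter: the threshold moves with the
    top score instead of being a fixed constant, so the best-scoring concept
    is always kept.
    """
    scores = np.array(
        [cosine_similarity(neuron_embedding, c) for c in concept_embeddings]
    )
    threshold = scores.max() - margin
    return [name for name, s in zip(concept_names, scores) if s >= threshold]


def heatmap_agreement(heatmap_a, heatmap_b):
    """Cosine similarity of two heatmaps; values near 1 mean the maps agree.

    Assumption: high agreement is read as a more reliable prediction.
    """
    return cosine_similarity(heatmap_a, heatmap_b)


# Toy usage with made-up 2-D embeddings and 2x2 heatmaps.
neuron = [1.0, 0.0]
concepts = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
names = ["dog", "puppy", "car"]
selected = select_concepts(neuron, concepts, names)

same = heatmap_agreement([[1.0, 0.0], [0.0, 0.0]], [[1.0, 0.0], [0.0, 0.0]])
disjoint = heatmap_agreement([[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]])
```

In this toy setup, a relative threshold keeps near-ties ("dog" and "puppy") while dropping unrelated concepts, and the agreement score separates identical heatmaps from non-overlapping ones.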