Class-wise Visual Explanations for Deep Neural Networks

Minhao Cheng; Zeyu Qin

22 Sept 2022 (modified: 13 Feb 2023), ICLR 2023 Conference Withdrawn Submission, Readers: Everyone

Keywords: Class-wise explanation, Backdoor attack detection, Global explanation

Abstract: Many explainable AI (XAI) methods have been proposed to interpret a neural network's decisions, explaining locally, through gradient information, why it predicts what it predicts. Yet existing work focuses mainly on local explanations and lacks the global knowledge needed to show class-wise explanations over the whole training procedure. To fill this gap, we propose to visualize a global explanation in the input space for every class learned during training. Specifically, our solution finds a representation set that demonstrates the learned knowledge for each class. To achieve this, we optimize the representation set by imitating the model training procedure over the full dataset. Experimental results show that our method generates high-quality class-wise explanations on a series of image classification datasets. Using our global explanations, we further analyze the model knowledge acquired under different training procedures, including adversarial training and noisy-label learning. Moreover, we illustrate that the generated explanations can lend insight into diagnosing model failures, such as revealing triggers in a backdoored model.

Area: Social Aspects of Machine Learning (e.g., AI safety, fairness, privacy, interpretability, human-AI interaction, ethics)

TL;DR: We propose a method to visualize a global explanation in the input space for every class learned during training.
