DAME: A Distillation Based Approach For Model-agnostic Local Explainability

24 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: visualization or interpretation of learned representations
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Post-hoc Explainability, Interpretability, Saliency
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: A new learnable post-hoc explainability model, the Distillation Approach for Model-agnostic Explainability (DAME), which successfully generates saliency while operating on the high-dimensional input image space itself, unlike existing local methods.
Abstract: The frameworks for explaining the functional space learned by deep neural networks, also known as eXplainable AI (XAI) models, are largely based on the notion of locality. Most approaches for local model-agnostic explainability employ linear models: since a linear model is inherently interpretable (its coefficients serve as the explanation), such models are used to approximate the non-linear function locally. In this paper, we argue that local linear approximation is ill-suited, as the black boxes under investigation are often highly non-linear. We present a novel perturbation-based approach for local explainability, called the Distillation Approach for Model-agnostic Explainability (DAME). It separates the two tasks of local approximation and explanation generation, and generates explanations by operating directly on the high-dimensional input space. The DAME framework is a learnable, saliency-based explainability model that is post-hoc, model-agnostic, and requires only query access to the black box. Extensive evaluations, including quantitative, qualitative, and subjective measures, presented on diverse object and sound classification tasks, demonstrate that the DAME approach provides improved explanations compared to other XAI methods.
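
To make the general recipe described in the abstract concrete, the sketch below illustrates one plausible reading: a small non-linear student is fit, by distillation, to black-box outputs on perturbations of a single input, and a saliency map is then read from the student's input gradient. The perturbation scheme, the student architecture, and the `black_box` interface are assumptions for illustration only, not the paper's actual DAME implementation.

```python
# Hedged sketch of a distillation-style local explanation. Names, the
# perturbation scheme, and the student architecture are illustrative
# assumptions, not the DAME method as published.
import torch
import torch.nn as nn

def local_distillation_saliency(black_box, x, n_samples=256, noise=0.1,
                                epochs=200, lr=1e-3):
    """Fit a small student to black-box outputs on perturbations of `x`
    and return a saliency map from the student's input gradient."""
    x = x.detach()
    # Query the black box on noisy copies of the input (query access only).
    perturbed = x.unsqueeze(0) + noise * torch.randn(n_samples, *x.shape)
    with torch.no_grad():
        targets = black_box(perturbed)          # soft outputs to distill
    # A small non-linear student, richer than the linear surrogates
    # used by methods such as LIME.
    student = nn.Sequential(
        nn.Flatten(),
        nn.Linear(x.numel(), 128), nn.ReLU(),
        nn.Linear(128, targets.shape[-1]),
    )
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(student(perturbed), targets).backward()
        opt.step()
    # Saliency: magnitude of the student's gradient w.r.t. the input.
    x_req = x.clone().unsqueeze(0).requires_grad_(True)
    student(x_req).max().backward()
    return x_req.grad.abs().squeeze(0)
```

A hypothetical call would be `saliency = local_distillation_saliency(model, image_tensor)`, where `model` maps a batch of images to class scores; the resulting map has the same shape as the input.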
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 8912