MAGNEx: A Model Agnostic Global Neural Explainer

Published: 28 Jan 2022, Last Modified: 13 Feb 2023, ICLR 2022 Submitted, Readers: Everyone
Keywords: Explainability, Neural Explainer, Faithfulness, Global, Post-hoc
Abstract: Black-box decision models have been widely adopted both in industry and academia due to their excellent performance across many challenging tasks and domains. However, much criticism has been raised around modern AI systems, to a large extent due to their inability to produce explainable decisions that both their end-users and their developers can trust. The need for such decisions, i.e., decisions accompanied by a rationale for why they are made, has ignited much recent research. We propose MAGNEx, a global algorithm that leverages neural-network-based explainers to produce rationales for any black-box decision model, neural or not. MAGNEx is model-agnostic, and thus easily generalizable across domains and applications. More importantly, MAGNEx is global, i.e., it learns to create rationales by optimizing for a number of instances at once, in contrast to local methods, which aim to explain a single example. The global nature of MAGNEx has two advantages over local methods: (i) it generalizes across instances, hence producing more faithful explanations, and (ii) it is computationally more efficient during inference. Our experiments confirm that MAGNEx outperforms popular explainability algorithms both in explanation quality and in computational efficiency.
One-sentence Summary: A global post-hoc method for black-box model explainability, utilizing neural-based explainers.