Analyzing the Effects of Classifier Lipschitzness on Explainers

Zulqarnain Khan; Aria Masoomi; Davin Hill; Jennifer Dy

Published: 01 Feb 2023, Last Modified: 26 May 2025

Submitted to ICLR 2023

Readers: Everyone

Keywords: Explainers, Explanation, Robustness, Astuteness, Lipschitz, Blackbox, Classifiers

TL;DR: Theoretical work in support of the intuition that robust classifiers lend themselves to robust explainers

Abstract: Machine learning methods are steadily improving at making predictions, but they are also becoming more complicated and less transparent. As a result, explainers are often relied on to provide interpretability for these black-box prediction models. Because explainers serve as crucial diagnostic tools, it is important that they are themselves reliable. In this paper we focus on one particular aspect of reliability, namely that an explainer should give similar explanations for similar data inputs. We formalize this notion by introducing and defining explainer astuteness, analogous to the astuteness of classifiers. Our formalism is inspired by the concept of probabilistic Lipschitzness, which captures the probability that a function is locally smooth. For a variety of explainers (e.g., SHAP, RISE, CXPlain), we provide lower-bound guarantees on explainer astuteness given the Lipschitzness of the prediction function. These theoretical results imply that locally smooth prediction functions lend themselves to locally robust explanations. We evaluate these results empirically on simulated as well as real datasets.
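To make the idea of "similar explanations for similar inputs" concrete, here is a minimal sketch of how such a quantity could be estimated from samples. The paper's formal definition is not reproduced on this page, so the sketch assumes one plausible reading: among pairs of points within distance `r` of each other, astuteness is the fraction whose explanations differ by at most `lam` times the distance between the inputs. The function name `empirical_astuteness`, the parameters `r` and `lam`, and the toy gradient-times-input explainer are all hypothetical illustrations, not the authors' code or the API of SHAP, RISE, or CXPlain.

```python
import numpy as np


def empirical_astuteness(explain, X, r, lam):
    """Estimate astuteness under the assumed definition above.

    Returns the fraction of pairs (x, x') with ||x - x'|| <= r whose
    explanations satisfy ||explain(x) - explain(x')|| <= lam * ||x - x'||.
    Returns NaN if no pair of points lies within distance r.
    """
    E = np.array([explain(x) for x in X])
    close_pairs = 0
    smooth_pairs = 0
    n = len(X)
    for i in range(n):
        for j in range(i + 1, n):
            d = np.linalg.norm(X[i] - X[j])
            if d <= r:
                close_pairs += 1
                if np.linalg.norm(E[i] - E[j]) <= lam * d:
                    smooth_pairs += 1
    return smooth_pairs / close_pairs if close_pairs else float("nan")


# Toy setup (hypothetical): a linear predictor f(x) = w . x with a
# gradient-times-input explanation, phi(x) = w * x (elementwise).
w = np.array([2.0, 0.5])


def explain(x):
    return w * x


rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(50, 2))

for lam in (0.5, 1.0, 2.0):
    print(lam, empirical_astuteness(explain, X, r=0.5, lam=lam))
```

In this toy case the explanation map is itself Lipschitz with constant `max(abs(w)) = 2.0`, so the estimate reaches 1 once `lam` is at least 2 and drops for smaller `lam`. This mirrors, in a very simplified way, the abstract's claim that smoothness of the prediction function carries over to robustness of the explanations.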

Please Choose The Closest Area That Your Submission Falls Into: Social Aspects of Machine Learning (e.g., AI safety, fairness, privacy, interpretability, human-AI interaction, ethics)

Supplementary Material: zip
