XAudit: A Learning-Theoretic Look at Auditing with Explanations

TMLR Paper 2253 Authors

17 Feb 2024 (modified: 01 Mar 2024) · Under review for TMLR
Abstract: Responsible use of machine learning requires models to be audited for undesirable properties. While a body of work has proposed using explanations for auditing, how to do so, and why, has remained relatively ill-understood. This work formalizes the role of explanations in auditing, drawing inspiration from active learning, and investigates whether and how model explanations can help audits. As an instantiation of our framework, we propose explanation-based algorithms for auditing linear classifiers and decision trees for 'feature sensitivity'. Our results illustrate that counterfactual explanations are extremely helpful for auditing, even in the worst case. While anchor explanations and decision paths may not be as beneficial in the worst case, they do aid significantly in the average case.
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Gintare_Karolina_Dziugaite1
Submission Number: 2253