Position: XAI needs formal notions of explanation correctness

Published: 10 Oct 2024, Last Modified: 03 Dec 2024
Venue: IAI Workshop @ NeurIPS 2024
Readers: Everyone
License: CC BY 4.0
Keywords: explainable AI, XAI, formalization, correctness, validation, benchmarking
TL;DR: XAI currently lacks formal problem specifications, which prevents its validation and is the root cause of misinterpretations and its unfitness for various intended purposes.
Abstract: The use of machine learning (ML) in critical domains such as medicine poses risks and requires regulation. One requirement is that decisions of ML systems in high-risk applications should be human-understandable. The field of "explainable artificial intelligence" (XAI) seemingly addresses this need. However, in its current form, XAI is unfit to provide quality control for ML; it itself needs scrutiny. Popular XAI methods cannot reliably answer important questions about ML models, their training data, or a given test input. We recapitulate results demonstrating that popular feature attribution and counterfactual estimation methods systematically attribute importance to input features that are independent of the prediction target, and that popular faithfulness metrics incentivize attribution to such features. This limits their utility for purposes such as model and data (in)validation, model improvement, and scientific discovery. We argue that the fundamental reason for this limitation is that current XAI methods do not address well-defined problems and are not evaluated against objective criteria of explanation correctness. Researchers should formally define the problems they intend to solve first and then design methods accordingly. This will lead to notions of explanation correctness that can be theoretically verified and objective metrics of explanation performance that can be assessed using ground-truth data.
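The abstract's claim that attribution can land on features independent of the prediction target can be illustrated with a classic suppressor-variable construction. The sketch below is not taken from the paper. It is a minimal illustration under stated assumptions: a two-feature linear regression, with the fitted weights standing in for a feature attribution, and with all names (`suppressor_demo`, `signal`, `distractor`) hypothetical.

```python
import numpy as np


def suppressor_demo(n_samples=10_000, seed=0):
    """Fit a linear model on toy data containing a suppressor feature.

    Hypothetical illustration: x2 carries no information about y on its own,
    yet the optimal linear model needs it to cancel the noise in x1.
    """
    rng = np.random.default_rng(seed)
    signal = rng.standard_normal(n_samples)      # determines the target
    distractor = rng.standard_normal(n_samples)  # independent of the target

    y = signal
    x1 = signal + distractor  # informative feature, contaminated by the distractor
    x2 = distractor           # suppressor: statistically independent of y

    X = np.column_stack([x1, x2])
    # Ordinary least squares; since y = x1 - x2 exactly, the fit recovers [1, -1].
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)

    # Sample correlation between the suppressor feature and the target.
    corr_x2_y = np.corrcoef(x2, y)[0, 1]
    return weights, corr_x2_y


weights, corr_x2_y = suppressor_demo()
print("model weights:", weights)
print("corr(x2, y):", corr_x2_y)
```

In this toy setting the model assigns `x2` a weight as large in magnitude as that of `x1`, even though `x2` is independent of `y`. Any attribution that reads importance off the weights would therefore flag a feature with no statistical association to the target, which is the kind of behavior a formal correctness criterion would need to rule in or out explicitly.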
Track: Position paper track
Submitted Paper: No
Published Paper: No
Submission Number: 5