Keywords: Interpretability, Explainability, Adversarial Manipulation, Evaluation, Reproducibility, Reliability, Faithfulness
TL;DR: Because ground-truth explanations are unavailable, XAI evaluation methods can be adversarially manipulated, which undermines their trustworthiness.
Abstract: The absence of ground-truth explanation labels poses a key challenge for quantitative evaluation in interpretable AI (IAI), particularly when evaluation methods involve numerous user-specified hyperparameters. Without ground truth, optimising hyperparameter selection is difficult, so researchers often base their choices on similar studies, a practice that leaves considerable flexibility. We show how this flexibility can be exploited to manipulate evaluation outcomes by framing it as an adversarial attack in which minor hyperparameter adjustments lead to significant changes in results. Our experiments demonstrate substantial variations in evaluation outcomes across multiple datasets, explanation methods, and models. To counteract this, we propose a ranking-based mitigation strategy that enhances robustness against such manipulations. This work underscores the challenges of reliable evaluation in IAI. Code is available at https://github.com/Wickstrom/quantitative-IAI-manipulation.
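To make the ranking-based mitigation concrete, here is a minimal Python sketch of rank aggregation across hyperparameter configurations. All names, shapes, and the random scores are illustrative assumptions, not the paper's actual implementation (see the linked repository for that).

```python
import numpy as np

# A minimal sketch (not the paper's implementation) of rank-based
# aggregation: instead of reporting evaluation scores under one
# hand-picked hyperparameter configuration, rank the explanation
# methods under many configurations and aggregate the ranks.

rng = np.random.default_rng(0)

n_methods = 4    # hypothetical explanation methods being compared
n_configs = 50   # sampled hyperparameter configurations of the metric

# scores[i, j]: evaluation score of method j under configuration i;
# random numbers stand in for real faithfulness scores here.
scores = rng.random((n_configs, n_methods))

# Within each configuration, rank the methods
# (rank 0 = best, assuming a higher score is better).
ranks = np.argsort(np.argsort(-scores, axis=1), axis=1)

# Aggregate: the mean rank of each method across all configurations.
# A single favourable configuration now barely moves the result.
mean_ranks = ranks.mean(axis=0)
print("mean rank per method:", mean_ranks)
```

The design intuition is that an adversary can inflate a raw score by tuning one configuration, but shifting a method's aggregated rank requires it to win consistently across many configurations.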
Track: Published paper track
Submitted Paper: No
Published Paper: Yes
Published Venue: eXCV Workshop at ECCV 2024 (Proceedings Track)
Submission Number: 78