Track: Regular Track (Page limit: 6-8 pages)
Keywords: Feature Attributions, Explanation Evaluation
Abstract: Feature attribution methods are widely used to explain machine learning models, yet their evaluation is challenging due to competing quality criteria such as faithfulness, robustness, and sparsity. These criteria often conflict, and even alternative formulations of the same metric can yield inconsistent conclusions. We address this by introducing a unifying framework that analyzes systematic incompatibilities between measures of explanation quality. Within this framework, we develop two novel mathematical tools: a sample-wise incompatibility index that quantifies systematic conflicts between criteria, and a generalized eigen-analysis that localizes where tradeoffs are concentrated within attribution results. Experiments on image classifiers show that this analysis provides insights beyond isolated metrics and complements current evaluation practices for feature attributions.
Supplementary Material: pdf
Submission Number: 16
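
The abstract names two tools, a sample-wise incompatibility index and a generalized eigen-analysis of tradeoffs, without defining them here. The sketch below is a hypothetical illustration of what such tools could look like, not the paper's actual method: it assumes the index is a Kendall-style discordance rate between two per-sample quality scores, and that the eigen-analysis solves a generalized symmetric eigenproblem over Gram matrices of per-feature criterion sensitivities (`J_a`, `J_b` are assumed inputs, e.g., from autodiff).

```python
# Hedged sketch of the two tools named in the abstract. The definitions below
# are assumptions made for illustration, not the paper's formulations.
import numpy as np
from scipy.linalg import eigh


def incompatibility_index(scores_a, scores_b):
    """Fraction of sample pairs ranked in opposite order by two criteria.

    scores_a, scores_b: 1-D arrays of per-sample quality scores (higher = better).
    Returns a value in [0, 1]; ~0.5 means no systematic conflict, values near 1
    mean improving one criterion systematically degrades the other.
    (Assumed definition: a Kendall-style discordance rate.)
    """
    a, b = np.asarray(scores_a), np.asarray(scores_b)
    i, j = np.triu_indices(len(a), k=1)  # all unordered sample pairs
    discordant = np.sign(a[i] - a[j]) * np.sign(b[i] - b[j]) < 0
    return discordant.mean()


def tradeoff_directions(J_a, J_b, eps=1e-8):
    """Generalized eigen-analysis localizing where tradeoffs concentrate.

    J_a, J_b: (n_samples, n_features) sensitivities of each quality criterion
    w.r.t. the attribution values (assumed to be available).
    Solves A v = lambda B v with A = J_a^T J_a and B = J_b^T J_b; directions
    with large lambda are dominated by criterion A, small lambda by criterion B.
    """
    A = J_a.T @ J_a
    B = J_b.T @ J_b + eps * np.eye(J_b.shape[1])  # regularize: B must be PD
    return eigh(A, B)  # generalized symmetric eigenproblem


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic scores with a built-in conflict between two criteria.
    faithfulness = rng.normal(size=200)
    sparsity = -0.7 * faithfulness + 0.3 * rng.normal(size=200)
    print(f"incompatibility index: {incompatibility_index(faithfulness, sparsity):.2f}")

    # Synthetic sensitivity matrices over 10 attribution features.
    J_f = rng.normal(size=(200, 10))
    J_s = rng.normal(size=(200, 10))
    eigvals, eigvecs = tradeoff_directions(J_f, J_s)
    print("largest generalized eigenvalues:", np.round(eigvals[-3:], 2))
```

Under these assumptions, an index well above 0.5 flags a systematic conflict between the two criteria, and the extreme generalized eigenvectors point at the feature directions where that conflict is concentrated.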