On the Faithfulness of Vision Transformer Explanations

Published: 01 Jan 2024 · Last Modified: 11 Apr 2025 · CVPR 2024 · CC BY-SA 4.0
Abstract: To interpret Vision Transformers, post-hoc explanations assign salience scores to input pixels, providing human-understandable heatmaps. However, whether these interpretations reflect true rationales behind the model's output is still underexplored. To address this gap, we study the faithfulness criterion of explanations: the assigned salience scores should represent the influence of the corresponding input pixels on the model's predictions. To evaluate faithfulness, we introduce the Salience-guided Faithfulness Coefficient (SaCo), a novel evaluation metric leveraging essential information of the salience distribution. Specifically, we conduct pair-wise comparisons among distinct pixel groups and then aggregate the differences in their salience scores, resulting in a coefficient that indicates the explanation's degree of faithfulness. Our explorations reveal that current metrics struggle to differentiate between advanced explanation methods and Random Attribution, thereby failing to capture the faithfulness property. In contrast, our proposed SaCo offers a reliable faithfulness measurement, establishing a robust metric for interpretations. Furthermore, our SaCo demonstrates that the use of gradient and multi-layer aggregation can markedly enhance the faithfulness of attention-based explanations, shedding light on potential paths for advancing Vision Transformer explainability.
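The abstract describes the metric only at a high level: pixels are grouped by salience, the groups are compared pair-wise, and the salience differences are aggregated into a single coefficient. A minimal sketch of that idea is given below; the group count, the perturbation scheme behind `model_confidence_drop`, and the exact aggregation weights are assumptions for illustration, not the paper's definitive formulation.

```python
import numpy as np

def saco_sketch(model_confidence_drop, salience, num_groups=10):
    """Hypothetical sketch of a salience-guided faithfulness coefficient.

    salience: 1-D array of per-pixel salience scores from an explanation.
    model_confidence_drop: callable taking a set of pixel indices and
        returning the drop in predicted-class confidence when those
        pixels are perturbed (e.g., masked or blurred).
    """
    # Rank pixels by salience and split them into equally sized groups,
    # ordered from most to least salient.
    order = np.argsort(-salience)
    groups = np.array_split(order, num_groups)

    # Total salience and measured influence of each group.
    s = np.array([salience[g].sum() for g in groups])
    drop = np.array([model_confidence_drop(g) for g in groups])

    # Pair-wise comparison: a faithful explanation should assign more
    # total salience to groups that cause a larger confidence drop.
    num, denom = 0.0, 0.0
    for i in range(num_groups):
        for j in range(i + 1, num_groups):
            w = abs(s[i] - s[j])
            agree = (s[i] - s[j]) * (drop[i] - drop[j]) >= 0
            num += w if agree else -w
            denom += w

    # Coefficient near 1: salience ordering matches measured influence;
    # near -1: salience ordering is systematically wrong.
    return num / denom if denom > 0 else 0.0
```

In this reading, the coefficient rewards explanations whose salience ranking of pixel groups agrees with the groups' actual impact on the model's prediction, which is what lets the metric separate genuine attribution methods from Random Attribution.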