Explaining ViTs Using Information Flow

Published: 22 Jan 2025 · Last Modified: 06 Mar 2025 · AISTATS 2025 Poster · CC BY 4.0
Abstract: Computer vision models can be explained by attributing the output decision to the input pixels. While effective methods for explaining convolutional neural networks have been proposed, these methods often produce low-quality attributions when applied to vision transformers (ViTs). State-of-the-art methods for explaining ViTs capture the flow of patch information using transition matrices. However, we observe that transition matrices alone are not sufficiently expressive to accurately explain ViT models. In this paper, we define a theoretical approach to creating explanations for ViTs called InFlow. The framework models the patch-to-patch information flow using a combination of transition matrices and patch embeddings. Moreover, we define an algebra for updating the transition matrices of series-connected components, diverging paths, and converging paths in the ViT model. This algebra allows the InFlow framework to produce high-quality attributions that explain ViT decision making. In experimental evaluation on ImageNet with three models, InFlow outperforms six ViT attribution methods on the standard insertion, deletion, SIC, and AIC metrics by up to 18%. Qualitative results demonstrate that InFlow produces more relevant and sharper explanations. Code is publicly available at https://github.com/chasewalker26/InFlow-ViT-Explanation.
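The abstract does not specify InFlow's exact update rules, but the general idea of composing patch-to-patch transition matrices across network components can be sketched. Below is a minimal, hypothetical illustration of such an algebra in the style of rollout-based ViT explanation methods: series-connected components compose by matrix product, and converging paths combine by a weighted sum renormalized so each row remains a distribution over patches. The function names `series` and `converge` and the combination rules are assumptions for illustration, not the paper's actual method.

```python
import numpy as np

def series(T1, T2):
    """Series-connected components: information flows through T1 then T2,
    so the combined transition is the matrix product T2 @ T1."""
    return T2 @ T1

def converge(Ts, weights=None):
    """Converging paths (e.g. a residual merge): combine branch transitions
    by a weighted sum, renormalized row-wise (a hypothetical choice)."""
    if weights is None:
        weights = np.ones(len(Ts)) / len(Ts)
    T = sum(w * Ti for w, Ti in zip(weights, Ts))
    return T / T.sum(axis=1, keepdims=True)

# Example: two attention-like row-stochastic transition matrices over 3 patches.
rng = np.random.default_rng(0)
A = rng.random((3, 3)); A /= A.sum(axis=1, keepdims=True)
B = rng.random((3, 3)); B /= B.sum(axis=1, keepdims=True)

T_series = series(A, B)    # rows still sum to 1 (product of stochastic matrices)
T_merged = converge([A, B])
```

Row-stochasticity is preserved by both operations here, which keeps each row interpretable as a distribution of attribution mass over input patches; InFlow's contribution, per the abstract, is that it additionally incorporates patch embeddings rather than relying on transition matrices alone.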
Submission Number: 836