[Re] Improving Interpretation Faithfulness for Vision Transformers

Izabela Kurek; Wojciech Trejter; Stipe Frkovic; Andro Erdelez

Published: 07 Jun 2025, Last Modified: 07 Jun 2025. Accepted by TMLR. License: CC BY 4.0

Abstract: This work aims to reproduce the results of Faithful Vision Transformers (FViTs) proposed by Hu et al. (2024), alongside the interpretability methods for Vision Transformers from Chefer et al. (2021) and Xu et al. (2022). We investigate the claims made by Hu et al. (2024) that using Diffusion Denoised Smoothing (DDS) improves the robustness of interpretations to (1) attacks in a segmentation task and (2) perturbations and attacks in a classification task. We also extend the original study by investigating the authors’ claim that adding DDS to any interpretability method can improve its robustness under attack. We test this on baseline methods and the recently proposed Attribution Rollout method. In addition, we measure the computational cost and environmental impact of obtaining an FViT through DDS. Our results broadly agree with the original study’s findings, although we found and discuss minor discrepancies.

Submission Length: Regular submission (no more than 12 pages of main content)

Changes Since Last Submission: Updated the author, code and publication information for camera-ready version.

Code: https://github.com/aerdelez/re-fvit

Assigned Action Editor: ~Shiyu_Chang2

Submission Number: 4282
