[Re] Improving Interpretation Faithfulness for Vision Transformers

TMLR Paper 4282 Authors

21 Feb 2025 (modified: 07 Mar 2025) · Under review for TMLR · CC BY 4.0
Abstract: This work aims to reproduce the results of Faithful Vision Transformers (FViTs) proposed by Hu et al. (2024), alongside interpretability methods for Vision Transformers from Chefer et al. (2021) and Xu et al. (2022). We investigate the claims made by Hu et al. (2024), namely that the use of Diffusion Denoised Smoothing (DDS) improves interpretability robustness (1) to attack in a segmentation task and (2) to perturbation in a classification task. We also extend the original study by investigating the authors' claim that adding DDS to any method can improve its robustness under attack; we test this on baseline interpretability algorithms and the recently proposed Attribution Rollout method. In addition, we measure the computational cost and environmental impact of obtaining an FViT through DDS. Our results broadly agree with the original study's findings, although we find and discuss minor discrepancies.
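For orientation, the following is a minimal sketch of how a Diffusion Denoised Smoothing wrapper is commonly placed around an interpretability method: perturb the input with Gaussian noise, map each noisy copy back toward the clean data distribution with a diffusion denoiser, and average the interpretation maps computed on the denoised copies. The denoiser, explainer, noise level, and sample count below are illustrative assumptions, not the authors' implementation.

import torch

def dds_explain(image, denoiser, explain_fn, sigma=0.25, n_samples=8):
    # Diffusion Denoised Smoothing around an interpretability method (sketch):
    # noise the input, denoise it, and average the resulting interpretation maps.
    # `denoiser` and `explain_fn` are assumed callables supplied by the user.
    maps = []
    for _ in range(n_samples):
        noisy = image + sigma * torch.randn_like(image)  # Gaussian perturbation
        denoised = denoiser(noisy, sigma)                # diffusion-based denoising step
        maps.append(explain_fn(denoised))                # e.g. an attention-rollout map
    return torch.stack(maps).mean(dim=0)                 # smoothed interpretation map

# Toy usage with an identity "denoiser" and a dummy explainer, to show the shapes.
img = torch.rand(1, 3, 224, 224)
smoothed = dds_explain(img, denoiser=lambda x, s: x, explain_fn=lambda x: x.abs().mean(1))
print(smoothed.shape)  # torch.Size([1, 224, 224])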
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: Shiyu Chang
Submission Number: 4282
