Improving ViT interpretability with patch-level mask prediction

Published: 01 Jan 2025, Last Modified: 15 May 2025 | Pattern Recognit. Lett. 2025 | CC BY-SA 4.0
Abstract: Highlights
• We propose a novel visual explanation for ViT using a patch-level localization task.
• Our method enhances explainability and localization performance across benchmarks.
• Our method works with pseudo masks from self-supervised approaches.