One Wave To Explain Them All: A Unifying Perspective On Feature Attribution

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
TL;DR: We generalize gradient-based explainability to the wavelet domain, expanding it to new modalities beyond images and showing that it yields novel insights into model understanding.
Abstract: Feature attribution methods aim to improve the transparency of deep neural networks by identifying the input features that influence a model's decision. Pixel-based heatmaps have become the standard for attributing features to high-dimensional inputs, such as images, audio representations, and volumes. While intuitive and convenient, these pixel-based attributions fail to capture the underlying structure of the data. Moreover, the choice of domain for computing attributions has often been overlooked. This work demonstrates that the wavelet domain allows for informative and meaningful attributions. It handles any input dimension and offers a unified approach to feature attribution. Our method, the **W**avelet **A**ttribution **M**ethod (WAM), leverages the spatial and scale-localized properties of wavelet coefficients to provide explanations that capture both the *where* and *what* of a model's decision-making process. We show that WAM quantitatively matches or outperforms existing gradient-based methods across multiple modalities, including audio, images, and volumes. Additionally, we discuss how WAM bridges attribution with broader aspects of model robustness and transparency. Project page: https://gabrielkasmi.github.io/wam/
Lay Summary: Modern machine learning models, especially deep neural networks, are very accurate but lack transparency, which can be a problem in critical applications. To overcome this limitation, feature attribution methods have been introduced to highlight which input variables contribute the most to the model's decision. For inputs such as images, feature attribution methods usually compute a heatmap that highlights which pixels or image regions contributed the most to the model's decision. However, these "pixel-based" methods cannot indicate whether it is the shape or the texture of the object that mattered for the model's prediction, and the pixel space is only convenient for representing images. Our work argues that attributing the model's decision to the wavelet decomposition of the input signal provides more informative and more generalizable model explanations. Wavelets are a mathematical tool that decomposes an input into different scales, making it possible to isolate textures or edges in images, or bursts of sound in waveforms. We demonstrate that our approach, the Wavelet Attribution Method (WAM), provides more meaningful explanations than existing techniques. It also opens up new ways to explore how reliable and robust a model's decisions are, contributing to AI transparency.
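To make the general idea concrete, below is a minimal, self-contained sketch of gradient-based attribution computed with respect to wavelet coefficients rather than pixels. It is not the authors' implementation (see the linked repository for WAM itself): it uses a single-level 2D Haar transform written directly in PyTorch so that gradients flow back to the coefficients, and the names `haar_dwt2`, `haar_idwt2`, `wavelet_attribution`, and the toy model `net` are illustrative placeholders.

```python
# Minimal sketch (assumption: not the authors' WAM code) of attributing a
# model's decision to wavelet coefficients instead of pixels.
import torch

def haar_dwt2(x):
    """Single-level 2D Haar transform of a (B, C, H, W) tensor.
    Returns the approximation sub-band and three detail sub-bands."""
    a = x[..., 0::2, 0::2]
    b = x[..., 0::2, 1::2]
    c = x[..., 1::2, 0::2]
    d = x[..., 1::2, 1::2]
    ll = (a + b + c + d) / 2
    lh = (a - b + c - d) / 2
    hl = (a + b - c - d) / 2
    hh = (a - b - c + d) / 2
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2 (perfect reconstruction)."""
    a = (ll + lh + hl + hh) / 2
    b = (ll - lh + hl - hh) / 2
    c = (ll + lh - hl - hh) / 2
    d = (ll - lh - hl + hh) / 2
    x = torch.zeros(ll.shape[0], ll.shape[1], ll.shape[2] * 2, ll.shape[3] * 2,
                    device=ll.device, dtype=ll.dtype)
    x[..., 0::2, 0::2] = a
    x[..., 0::2, 1::2] = b
    x[..., 1::2, 0::2] = c
    x[..., 1::2, 1::2] = d
    return x

def wavelet_attribution(net, image, target_class):
    """Gradient of the target logit with respect to each wavelet sub-band."""
    coeffs = [c.detach().requires_grad_(True) for c in haar_dwt2(image)]
    recon = haar_idwt2(*coeffs)                # reconstruct the input from its coefficients
    logit = net(recon)[:, target_class].sum()  # scalar score to differentiate
    logit.backward()
    return [c.grad.abs() for c in coeffs]      # one saliency map per sub-band

# Example usage with a toy classifier and a random image (placeholders).
net = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 224 * 224, 10))
image = torch.rand(1, 3, 224, 224)
saliencies = wavelet_attribution(net, image, target_class=3)
print([s.shape for s in saliencies])  # four (1, 3, 112, 112) maps
```

Because each sub-band is localized in both space and scale, the resulting maps indicate not only where the important content lies but also at which scale (coarse structure versus fine detail), which is the intuition behind the paper's "where and what" framing.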
Link To Code: https://github.com/gabrielkasmi/wam
Primary Area: Social Aspects->Accountability, Transparency, and Interpretability
Keywords: interpretability, feature attribution, wavelet, images, audio, 3D shapes
Submission Number: 6713