Keywords: explainable ai, interpretability, feature attribution, saliency, automatic differentiation
TL;DR: We propose SmoothDiff, a novel feature attribution method leveraging automatic differentiation, directly targeting nonlinearities responsible for the shattered gradient problem.
Abstract: Explaining complex machine learning models is a fundamental challenge when developing safe and trustworthy deep learning applications. To date, a broad selection of explainable AI (XAI) algorithms exists. One popular choice is SmoothGrad, which was conceived to alleviate the well-known shattered gradient problem by smoothing gradients through convolution. SmoothGrad approximates this high-dimensional convolution integral by sampling, which typically yields limited precision. More samples improve the approximation but also raise the computational cost, so in practice only a few samples are used. In this work, we propose _SmoothDiff_, a novel, well-founded method that resolves this tradeoff, yielding a _speedup of over two orders of magnitude_. Specifically, _SmoothDiff_ leverages automatic differentiation to decompose the expected values of Jacobians across a network architecture, directly targeting only the nonlinearities responsible for shattered gradients, which also makes it easy to implement. We demonstrate SmoothDiff's excellent speed and performance in a number of experiments and benchmarks. Thus, SmoothDiff greatly enhances the usability (quality and speed) of SmoothGrad, a popular workhorse of XAI.
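To make the sampling tradeoff described in the abstract concrete, the following is a minimal sketch, not the paper's implementation. It assumes a toy model consisting of a single ReLU unit, f(x) = max(0, w·x + b), and isotropic Gaussian input noise; all function names are hypothetical. `smoothgrad` averages gradients over noisy copies of the input, as SmoothGrad does. `expected_grad_closed_form` evaluates the same expectation exactly for this one unit using the standard Gaussian CDF. That closed form is a textbook identity chosen for illustration and should not be read as the SmoothDiff algorithm, which the abstract describes only at a high level.

```python
import math
import random


def relu_unit_grad(w, b, x):
    """Gradient of f(x) = max(0, w.x + b) with respect to x (toy model, assumed)."""
    pre = sum(wi * xi for wi, xi in zip(w, x)) + b
    return [wi if pre > 0 else 0.0 for wi in w]


def smoothgrad(w, b, x, sigma, n_samples, rng):
    """Monte Carlo estimate of E[grad f(x + eps)] with eps ~ N(0, sigma^2 I)."""
    total = [0.0] * len(x)
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        grad = relu_unit_grad(w, b, noisy)
        total = [t + g for t, g in zip(total, grad)]
    return [t / n_samples for t in total]


def expected_grad_closed_form(w, b, x, sigma):
    """Exact E[grad f(x + eps)] for the single ReLU unit above.

    w.(x + eps) + b is Gaussian with mean w.x + b and std sigma * ||w||,
    so the unit is active with probability Phi((w.x + b) / (sigma * ||w||)).
    """
    pre = sum(wi * xi for wi, xi in zip(w, x)) + b
    norm = math.sqrt(sum(wi * wi for wi in w))
    z = pre / (sigma * norm)
    p_active = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return [wi * p_active for wi in w]


if __name__ == "__main__":
    rng = random.Random(0)
    w, b, x, sigma = [1.0, -2.0], 0.1, [0.5, 0.2], 0.5
    print("exact:", expected_grad_closed_form(w, b, x, sigma))
    for n in (10, 100, 10000):
        print(n, "samples:", smoothgrad(w, b, x, sigma, n, rng))
```

Running the script prints the exact expectation next to estimates from increasing sample counts. In this toy setting, the small-sample estimates scatter around the exact value and the spread shrinks as the sample count grows, which is the precision-versus-cost tradeoff the abstract refers to.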
Primary Area: Social and economic aspects of machine learning (e.g., fairness, interpretability, human-AI interaction, privacy, safety, strategic behavior)
Submission Number: 23496