SmoothHess: ReLU Network Feature Interactions via Stein's Lemma

Max Torop; Aria Masoomi; Davin Hill; Kivanc Kose; Stratis Ioannidis; Jennifer Dy

Published: 21 Sept 2023, Last Modified: 14 Jan 2024, NeurIPS 2023 poster

Keywords: Interpretability, Feature Interactions, Stein's Lemma

TL;DR: We use Stein's Lemma to compute a smoothed Hessian for interpreting feature interactions in ReLU networks.

Abstract: Several recent interpretability methods model feature interactions by examining the Hessian of a neural network. This poses a challenge for ReLU networks, which are piecewise-linear and thus have a zero Hessian almost everywhere. We propose SmoothHess, a method for estimating second-order interactions through Stein's Lemma. In particular, we estimate the Hessian of the network convolved with a Gaussian through an efficient sampling algorithm, requiring only network gradient calls. SmoothHess is applied post-hoc, requires no modifications to the ReLU network architecture, and the extent of smoothing can be controlled explicitly. We provide a non-asymptotic bound on the sample complexity of our estimation procedure. We validate the superior ability of SmoothHess to capture interactions on benchmark datasets and a real-world medical spirometry dataset.
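To make the sampling idea in the abstract concrete, here is a minimal sketch of a Stein's Lemma Hessian estimator. It is not the authors' implementation. It assumes isotropic Gaussian smoothing with covariance sigma^2 I, for which Stein's Lemma gives the Hessian of the smoothed function as E[delta * grad_f(x + delta)^T] / sigma^2. The names `smoothed_hessian`, `relu_grad`, and `W`, and the one-unit toy "network" f(x) = relu(w . x), are hypothetical and chosen only for illustration.

```python
import random

def smoothed_hessian(grad_fn, x, sigma, n_samples, seed=0):
    """Monte Carlo estimate of the Hessian of f convolved with N(0, sigma^2 I).

    Assumed identity (Stein's Lemma, isotropic case):
        H = E[delta * grad_f(x + delta)^T] / sigma^2
    Only gradient calls to f are needed. Illustrative sketch, not the paper's code.
    """
    rng = random.Random(seed)
    d = len(x)
    acc = [[0.0] * d for _ in range(d)]
    for _ in range(n_samples):
        # Draw a Gaussian perturbation and query the gradient at the perturbed point.
        delta = [rng.gauss(0.0, sigma) for _ in range(d)]
        g = grad_fn([xi + di for xi, di in zip(x, delta)])
        # Accumulate the outer product delta * g^T.
        for i in range(d):
            for j in range(d):
                acc[i][j] += delta[i] * g[j]
    scale = 1.0 / (n_samples * sigma ** 2)
    est = [[a * scale for a in row] for row in acc]
    # Symmetrize: the smoothed Hessian is symmetric, the raw estimate need not be.
    return [[0.5 * (est[i][j] + est[j][i]) for j in range(d)] for i in range(d)]

# Hypothetical toy model: f(x) = relu(w . x), with gradient w if w . x > 0, else 0.
W = [1.0, 1.0]

def relu_grad(x):
    active = sum(wi * xi for wi, xi in zip(W, x)) > 0
    return [wi if active else 0.0 for wi in W]

H = smoothed_hessian(relu_grad, x=[0.0, 0.0], sigma=1.0, n_samples=50_000)
```

The ordinary Hessian of this toy function is zero wherever it is defined, yet the estimate `H` is clearly nonzero. For this particular function the smoothed Hessian at the origin has the closed form w w^T * phi(0) / (sigma * ||w||), where phi is the standard normal density, which is roughly 0.28 per entry. Increasing `sigma` widens the neighborhood being averaged over, which is the explicit smoothing control the abstract refers to.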

Supplementary Material: zip

Submission Number: 12471
