Release Opt Out: No, I don't wish to opt out of paper release. My paper should be released.
Keywords: explainable AI, feature interaction, feature attribution
Abstract: Feature attribution methods explain model predictions by computing the contribution of individual features. However, these methods often overlook the impact of feature interactions, which play a crucial role in tasks like image classification. In this work, we introduce Hessian Sets, a technique that leverages the Hessian matrix to detect and attribute pairwise feature interactions in image classifiers. We adapt Integrated Directional Gradients (IDG) to assign importance to these feature interaction sets. By integrating segmentation masks from the Segment Anything Model (SAM), we provide more interpretable and concise explanations. Our initial experiments on the Imagenette dataset demonstrate that our method produces sparse, interpretable feature attributions while effectively capturing important interactions. This is a work in progress, and we present preliminary results to highlight the potential of our approach for improving explainability in image classifiers.
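The core idea of using the Hessian to detect pairwise feature interactions can be illustrated with a minimal sketch. The snippet below (an illustration, not the authors' implementation) approximates the cross-partial derivative ∂²f/∂xᵢ∂xⱼ of a model f by central finite differences; a large magnitude indicates that features i and j interact rather than contribute independently.

```python
import numpy as np

def pairwise_interaction(f, x, i, j, eps=1e-4):
    """Approximate the Hessian cross-term d^2 f / dx_i dx_j via
    central finite differences. Its magnitude signals how strongly
    features i and j interact in the model f at input x."""
    def shifted(di, dj):
        z = x.copy()
        z[i] += di
        z[j] += dj
        return f(z)
    return (shifted(eps, eps) - shifted(eps, -eps)
            - shifted(-eps, eps) + shifted(-eps, -eps)) / (4 * eps ** 2)

# Toy model with a multiplicative interaction between features 0 and 1.
f = lambda z: z[0] * z[1] + z[2] ** 2
x = np.array([1.0, 2.0, 3.0])
print(pairwise_interaction(f, x, 0, 1))  # large: features 0 and 1 interact
print(pairwise_interaction(f, x, 0, 2))  # near zero: no interaction
```

In practice one would compute these cross-terms with automatic differentiation over a trained classifier and group strongly interacting pixels (or SAM segments) into the feature sets that IDG then scores.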
Submission Number: 52