Sum-of-Parts Models: Faithful Attributions for Groups of Features

Published: 27 Oct 2023, Last Modified: 10 Nov 2023, NeurIPS XAIA 2023
TL;DR: We develop Sum-of-Parts (SOP), a class of models whose predictions come with grouped feature attributions that are faithful-by-construction.
Abstract: An explanation of a machine learning model is considered "faithful" if it accurately reflects the model's decision-making process. However, explanations such as feature attributions for deep learning are not guaranteed to be faithful, and can produce potentially misleading interpretations. In this work, we develop Sum-of-Parts (SOP), a class of models whose predictions come with grouped feature attributions that are faithful-by-construction. This model decomposes a prediction into an interpretable sum of scores, each of which is directly attributable to a sparse group of features. We evaluate SOP on benchmarks with standard interpretability metrics, and in a case study, we use the faithful explanations from SOP to help astrophysicists discover new knowledge about galaxy formation.
Submission Track: Full Paper Track
Application Domain: Natural Science
Survey Question 1: We develop Sum-of-Parts (SOP), a class of models that uses a group generator to create groups of features and a group selector to assign a score to each group. Each score is exactly how much its group contributes to the prediction, which makes the attributions faithful to the model prediction. Applying SOP's faithful grouped attributions to weak lensing maps, we uncover novel insights about galaxy formation that are meaningful to cosmologists.
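The decomposition described above can be illustrated with a toy sketch. It is not the authors' implementation: the function name `sum_of_parts_predict`, the fixed binary masks, the hand-picked scores, and the toy backbone are all assumptions made for illustration. In SOP the groups and scores come from the learned group generator and group selector.

```python
def sum_of_parts_predict(x, masks, scores, backbone):
    """Toy sum-of-parts prediction (hypothetical helper, not the paper's code).

    x        : list of feature values
    masks    : list of binary masks, one per group (1 = feature is in the group)
    scores   : one weight per group (stand-in for the group selector's output)
    backbone : function mapping a masked input to a scalar output
    """
    contributions = []
    for mask, score in zip(masks, scores):
        # Keep only the features that belong to this group.
        masked_input = [xi * mi for xi, mi in zip(x, mask)]
        # The group's contribution is its score times the backbone output.
        contributions.append(score * backbone(masked_input))
    # The prediction is, by construction, the sum of the group contributions.
    return sum(contributions), contributions


# Assumed toy setup: the backbone simply sums its input features.
x = [1.0, 2.0, 3.0, 4.0]
masks = [[1, 1, 0, 0], [0, 0, 1, 1]]
scores = [0.5, 0.5]

prediction, contributions = sum_of_parts_predict(x, masks, scores, backbone=sum)
print(prediction, contributions)
```

Because the prediction is computed as the sum of the per-group contributions, the contributions account for the output exactly, which is the sense of "faithful by construction" in this sketch.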
Survey Question 2: The deletion and insertion errors measure how well the total attribution of the features in an explanation aligns with the change in model prediction when those features are removed from the full input (deletion) or added to a blank input (insertion). Even for simple boolean monomials and binomials, the total deletion and insertion errors grow exponentially with the dimension. Grouped attributions can overcome these exponentially growing errors when the features interact with each other.
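A small numerical sketch can make the growth concrete. The formalization below is an assumption rather than the paper's exact definition: the deletion error of a feature subset is taken as the absolute gap between the change in prediction when the subset is zeroed out and the sum of that subset's attributions, and the total error sums this gap over all non-empty subsets. The uniform attribution 1/d is only one illustrative choice, so this shows the behavior for that choice and is not a proof for all attributions.

```python
from itertools import combinations
from math import prod


def monomial(x):
    """Boolean monomial: the product of all input features."""
    return prod(x)


def total_deletion_error(f, x, attributions):
    """Sum of |prediction change - attributed change| over all non-empty subsets.

    Assumed formalization for illustration; features are 'deleted' by zeroing.
    """
    d = len(x)
    full_output = f(x)
    total = 0.0
    for k in range(1, d + 1):
        for subset in combinations(range(d), k):
            deleted = [0 if i in subset else xi for i, xi in enumerate(x)]
            change = full_output - f(deleted)
            attributed = sum(attributions[i] for i in subset)
            total += abs(change - attributed)
    return total


# Evaluate at the all-ones input with a uniform per-feature attribution of 1/d.
for d in range(2, 9):
    x = [1] * d
    uniform = [1.0 / d] * d
    print(d, total_deletion_error(monomial, x, uniform))
```

Under these assumptions the total works out to 2^(d-1) - 1 up to floating-point rounding, so it roughly doubles with each added dimension. In the same toy setting, a single group containing all d features with attribution 1 matches the prediction change exactly when the group is deleted as a unit.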
Survey Question 3: We develop and use Sum-of-Parts (SOP), a class of models that is compatible with any backbone architecture and whose group-sparse feature attributions are faithful by construction.
Submission Number: 72