TL;DR: TN-SHAP-G computes Shapley values and interactions for graph models by learning a graph-aligned tensor-network surrogate, enabling deterministic and query-efficient attributions that scale beyond sampling-based methods.
Abstract: Shapley values are a widely used tool for attributing importance and interactions among input variables in black-box models, but computing them exactly requires evaluating a set function over exponentially many subsets of the inputs. We propose TN-SHAP-G, a framework that exploits the structure of graph inputs to compute Shapley values and higher-order interaction indices efficiently. Given a predictor and a fixed masking scheme, TN-SHAP-G learns a compact, graph-aligned multilinear surrogate that approximates the predictor's behavior on masked inputs. The surrogate is represented as a tensor network whose topology mirrors the input graph. Once trained from a small number of oracle queries, the surrogate enables deterministic recovery of first- and higher-order Shapley indices via the multilinear extension, without additional model queries or Monte Carlo variance. Experiments on molecular benchmarks show that the learned factorization closely matches exact Shapley values on small graphs and scales efficiently to larger graphs where sampling-based methods become infeasible.
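The recovery step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy chain-shaped (tensor-train) surrogate f(x) = l^T prod_i (A_i + x_i B_i) r on a path graph, with hypothetical names `cores_a`, `cores_b`, `left`, and `right`. The sketch relies on the standard identity that the Shapley value of player i equals the integral over t in [0, 1] of the partial derivative of the multilinear extension with respect to x_i, evaluated at (t, ..., t). For this surrogate that derivative is a polynomial of degree n - 1 in t, so Gauss-Legendre quadrature with ceil(n / 2) nodes integrates it exactly, up to floating-point error.

```python
import itertools
import math

import numpy as np


def chain_value(cores_a, cores_b, left, right, x):
    """Evaluate l^T prod_i (A_i + x_i B_i) r for a mask x in [0, 1]^n."""
    vec = left
    for a, b, xi in zip(cores_a, cores_b, x):
        vec = vec @ (a + xi * b)
    return float(vec @ right)


def shapley_bruteforce(value_fn, n):
    """Reference Shapley values by enumerating all subsets (exponential in n)."""
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            weight = (math.factorial(k) * math.factorial(n - k - 1)
                      / math.factorial(n))
            for subset in itertools.combinations(others, k):
                x = np.zeros(n)
                x[list(subset)] = 1.0
                without_i = value_fn(x)
                x[i] = 1.0
                phi[i] += weight * (value_fn(x) - without_i)
    return phi


def shapley_multilinear(cores_a, cores_b, left, right):
    """Shapley values of the chain surrogate via its multilinear extension."""
    n = len(cores_a)
    # m Gauss-Legendre nodes are exact up to degree 2m - 1 >= n - 1.
    m = max(1, math.ceil(n / 2))
    nodes, weights = np.polynomial.legendre.leggauss(m)
    nodes = 0.5 * (nodes + 1.0)  # map [-1, 1] to [0, 1]
    weights = 0.5 * weights
    phi = np.zeros(n)
    for t, w in zip(nodes, weights):
        mats = [a + t * b for a, b in zip(cores_a, cores_b)]
        # prefix[i] = left @ mats[0] @ ... @ mats[i - 1]
        prefix = [left]
        for mat in mats:
            prefix.append(prefix[-1] @ mat)
        # suffix[i] = mats[i] @ ... @ mats[n - 1] @ right
        suffix = [right]
        for mat in reversed(mats):
            suffix.append(mat @ suffix[-1])
        suffix.reverse()
        for i in range(n):
            # d/dx_i replaces the i-th factor (A_i + t B_i) with B_i.
            phi[i] += w * float(prefix[i] @ cores_b[i] @ suffix[i + 1])
    return phi
```

In this sketch the quadrature route costs a number of small matrix products that grows polynomially with n, while the brute-force reference evaluates the surrogate on every subset. The two should agree to floating-point precision on small chains, which is a useful sanity check before trusting the fast route on larger ones.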
Lay Summary: Modern AI models are often used on data that naturally forms a graph. Molecules are one example: atoms are the nodes and chemical bonds are the connections between them. When a model predicts that a molecule is toxic or useful as a drug, scientists want to understand why: which atoms mattered most, and which groups of atoms interacted to drive the prediction.
A common way to assign this “credit” comes from game theory and is called the Shapley value. The problem is that computing Shapley values requires considering every possible combination of atoms, and the number of combinations grows exponentially with molecule size. Existing methods either make thousands of expensive model calls or rely on noisy approximations.
In this paper, we introduce TN-SHAP-G, a new method that learns a small structured surrogate of the original AI system. Instead of treating the molecule as unstructured data, the surrogate mirrors the molecule’s graph structure itself. This lets it efficiently capture how nearby atoms influence one another while using far fewer model queries.
Once this surrogate model is trained, we can compute its importance scores, including interactions between atoms, exactly and deterministically, without repeatedly querying the original model. On molecular benchmarks, TN-SHAP-G closely matches exact Shapley values on small graphs while scaling to much larger molecules where previous methods become too slow or memory-intensive.
More broadly, the method makes AI explanations faster, cheaper, and more reliable for scientific applications like drug discovery, where understanding a model’s reasoning is critical.
Primary Area: Theory->Game Theory
Keywords: GNNs, Shapley, Tensor Networks, Surrogate models, Cooperative game theory, Explainability
Submission Number: 30173