Keywords: explainable AI, Shapley value, interaction
Abstract: The Shapley value is a fundamental game-theoretic framework for allocating a utility function’s output among participating players, and is commonly interpreted as the expected marginal contribution under random coalitions. However, when applied to complex functions such as deep neural networks, this expected marginal contribution implicitly aggregates higher-order interaction effects, which can obscure the true contribution of features. In this study, we derive a generalized decomposition of the Shapley value that expresses it as a sum of interaction terms of arbitrary order, making explicit how higher-order interactions are incorporated within marginal contributions. We also provide an unbiased estimator for our representation via permutation sampling, enabling practical computation. We further show that when interaction effects vary substantially across contexts, these embedded higher-order terms can lead to misleading attributions for model interpretation. Our theoretical analysis and empirical evaluations demonstrate that variance in lower-order interactions reliably signals the presence of hidden higher-order structure, providing a principled criterion for when such interactions should be explored. This interaction-based perspective clarifies when the Shapley value becomes unreliable and offers new guidance for interpreting model behavior.
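For reference, the "expected marginal contribution under random coalitions" in the abstract is the standard definition of the Shapley value; the statement below is that standard definition, with notation such as the player set N assumed here rather than taken from the submission:

```latex
% Standard Shapley value of player i in a game v over player set N:
% the weighted average of i's marginal contribution over all coalitions S
% not containing i (equivalently, the expectation of that contribution
% under a uniformly random permutation of the players).
\phi_i(v) \;=\; \sum_{S \subseteq N \setminus \{i\}}
  \frac{|S|!\,\bigl(|N|-|S|-1\bigr)!}{|N|!}\,
  \bigl(v(S \cup \{i\}) - v(S)\bigr)
```

The abstract's unbiased estimator builds on permutation sampling. The sketch below illustrates only the standard Monte Carlo permutation estimator of the plain Shapley value, not the submission's generalized interaction estimator; the function name, toy game, and sample count are illustrative assumptions.

```python
# Minimal sketch (assumption: the standard permutation-sampling Shapley
# estimator, not the submission's interaction-aware variant). `v` maps a
# frozenset of player indices to a real payoff; players are 0..n-1.
import random

def shapley_permutation_estimate(v, n, num_samples=1000, seed=0):
    """Estimate the Shapley value of every player by averaging marginal
    contributions along uniformly random player orderings."""
    rng = random.Random(seed)
    phi = [0.0] * n
    order = list(range(n))
    for _ in range(num_samples):
        rng.shuffle(order)                 # draw a uniformly random permutation
        coalition = set()
        prev_value = v(frozenset())        # value of the empty coalition
        for i in order:
            coalition.add(i)
            cur_value = v(frozenset(coalition))
            phi[i] += cur_value - prev_value   # marginal contribution of i
            prev_value = cur_value
    return [total / num_samples for total in phi]

# Illustrative toy game with a pairwise interaction between players 0 and 1:
# the interaction's worth is split evenly, so the exact values are (2, 2, 0).
if __name__ == "__main__":
    def toy_game(S):
        return (0 in S) + (1 in S) + 2.0 * (0 in S and 1 in S)
    print(shapley_permutation_estimate(toy_game, n=3, num_samples=5000))
```

Because each player's marginal contribution under a uniformly random permutation has the Shapley value as its expectation, the per-player averages above are unbiased; the abstract's contribution is to extend this kind of estimator to interaction terms of arbitrary order.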
Primary Area: interpretability and explainable AI
Submission Number: 11357