Controlling Structured Explanations via Shapley Values

18 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Shapley values, structured explanations, SAG, subexplanations, ProtoTree
Abstract: Structured explanations elucidate complex feature interactions in deep networks, promoting interpretability and accountability. However, existing work focuses primarily on post hoc diagnostic analyses and does not address the fidelity of structured explanations during network training. In contrast, we adopt a Shapley value-based framework to analyze and regulate structured explanations during training. Our analysis shows that the number of valid subexplanations in the structured explanations of Transformers and CNNs correlates strongly with each model's feature interaction strength. We further adopt a Shapley value-based multi-order interaction regularizer and demonstrate experimentally on the large-scale ImageNet and fine-grained CUB-200 datasets that this regularization enables active control of explanation scale and interpretability during training.
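To make the abstract's notion of multi-order interaction strength concrete, here is a minimal sketch, not the paper's implementation, of the standard Shapley-based multi-order interaction I^(m)(i, j) = E over contexts S of size m of [v(S∪{i,j}) − v(S∪{i}) − v(S∪{j}) + v(S)], estimated by Monte Carlo sampling. The function and variable names (`value_fn`, `multi_order_interaction`) are illustrative assumptions, and `value_fn` stands in for a model's output on a masked input.

```python
import random

def multi_order_interaction(value_fn, players, i, j, order, n_samples=200, seed=0):
    """Monte Carlo estimate of the order-m interaction between players i and j.

    Averages v(S ∪ {i,j}) - v(S ∪ {i}) - v(S ∪ {j}) + v(S) over random
    contexts S of size `order` drawn from the remaining players.
    `value_fn` maps a frozenset of players to a scalar (e.g., a masked
    model output). Names here are illustrative, not the paper's code.
    """
    rest = [p for p in players if p not in (i, j)]
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        S = frozenset(rng.sample(rest, order))
        total += (value_fn(S | {i, j}) - value_fn(S | {i})
                  - value_fn(S | {j}) + value_fn(S))
    return total / n_samples

# Toy value function: additive per-player payoff plus a fixed bonus
# when players 0 and 1 co-occur, so I^(m)(0, 1) should recover the bonus.
def toy_value(S):
    return float(len(S)) + (2.0 if {0, 1} <= S else 0.0)

print(multi_order_interaction(toy_value, list(range(6)), 0, 1, order=2))
```

Under this toy game the estimate recovers the pairwise bonus exactly at every order, since the additive part cancels in the four-term difference; a regularizer of the kind the abstract describes would penalize or encourage such interaction terms at chosen orders during training.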
Primary Area: interpretability and explainable AI
Submission Number: 12607