Causal Shapley Values via Bayesian Structure Learning: Provably Faithful Explanations for Black-Box Models
Keywords: Explainable AI, Shapley values, causal inference, Bayesian structure learning, DAG, do-calculus, KernelSHAP, TreeSHAP, feature attribution, MIMIC-III
TL;DR: CausalSHAP computes Shapley values under do-interventions on learned causal DAGs, achieving 35--52\% lower explanation error and 89\% correct suppressor-variable identification.
Abstract: Standard Shapley values for feature attribution assume feature independence, producing misleading explanations when features are causally related. We introduce \textsc{CausalSHAP}, which integrates Bayesian causal structure learning with Shapley value computation to produce provably faithful explanations that respect the data-generating process. Our framework places a DAG prior over causal structures, learns the posterior via variational inference, and computes Shapley values by intervening (do-calculus) rather than conditioning. Our theoretical contributions are: (1) a \emph{Faithfulness Theorem} proving \textsc{CausalSHAP} attributions converge to true causal effects at rate $O(n^{-1/3})$ in total variation distance; (2) an \emph{Efficiency Theorem} showing \textsc{CausalSHAP} can be computed in $O(2^p \cdot \text{poly}(n))$ time where $p$ is the number of features, matching standard SHAP complexity; (3) a \emph{Consistency Theorem} guaranteeing convergence to ground-truth causal Shapley values as the DAG posterior concentrates. On synthetic causal benchmarks, medical diagnosis (MIMIC-III), and credit scoring (German Credit), \textsc{CausalSHAP} reduces explanation error by 35--52\% compared to KernelSHAP and TreeSHAP, and correctly identifies suppressor variables in 89\% of cases that standard methods miss.
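The core computational step the abstract describes, replacing conditional expectations with interventional ones $v(S) = \mathbb{E}[f(X) \mid do(X_S = x_S)]$ in the Shapley formula, can be illustrated on a toy example. The sketch below is not the paper's implementation: it assumes a hypothetical two-variable linear SCM ($X_1 \to X_2$) with a known DAG, a hand-written linear model, and plain Monte Carlo estimation of the interventional value function.

```python
import itertools
from math import factorial

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy SCM for illustration: X1 -> X2
#   X1 = U1,  X2 = 0.8 * X1 + U2,  with U1, U2 ~ N(0, 1)
def sample_scm(n, do=None):
    """Sample n rows from the SCM, applying do-interventions.

    `do` maps feature index -> fixed value; intervened nodes are clamped
    and their downstream effects propagate, while upstream nodes are
    left untouched (the defining difference from conditioning).
    """
    do = do or {}
    x1 = np.full(n, do[0]) if 0 in do else rng.standard_normal(n)
    x2 = np.full(n, do[1]) if 1 in do else 0.8 * x1 + rng.standard_normal(n)
    return np.column_stack([x1, x2])

def model(X):
    """Black-box model to explain (here just a known linear function)."""
    return 2.0 * X[:, 0] + 1.0 * X[:, 1]

def interventional_shapley(x, p=2, n_mc=20000):
    """Exact Shapley values over all 2^p coalitions, with the value
    function v(S) = E[f(X) | do(X_S = x_S)] estimated by Monte Carlo."""
    def v(S):
        return model(sample_scm(n_mc, do={i: x[i] for i in S})).mean()

    phi = np.zeros(p)
    for i in range(p):
        others = [j for j in range(p) if j != i]
        for r in range(len(others) + 1):
            for S in itertools.combinations(others, r):
                w = factorial(len(S)) * factorial(p - len(S) - 1) / factorial(p)
                phi[i] += w * (v(S + (i,)) - v(S))
    return phi

phi = interventional_shapley(np.array([1.0, 1.0]))
```

For this SCM, intervening on $X_1$ propagates credit through its causal child: at $x = (1, 1)$ the interventional attributions are approximately $(2.4, 0.6)$, whereas an independence-based value function would assign $(2.0, 1.0)$, ignoring the $X_1 \to X_2$ edge.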
Submission Number: 154