Keywords: causality, explainability
Abstract: Feature attribution methods aim to explain the predictions of machine learning models by assigning importance scores to input features. Recent work has highlighted the importance of developing attribution methods that respect causal structure. In particular, it showed that existing approaches can assign significant importance to variables outside the Markov boundary, even though these variables provide no additional predictive information once the Markov boundary is observed.
To address these limitations, we design a new attribution method that accounts for both the predictive power and the causal structure of the features. Our method does not assume access to the causal structure and achieves balanced attributions through a properly defined characteristic function.
We prove that our method assigns high attributions to the variables in the Markov boundary and evaluate it experimentally in a fairness-inspired setting.
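To make the notion of attribution via a characteristic function concrete, the sketch below computes exact Shapley values for a toy characteristic function over feature subsets. The subset values, feature names (`x1`, `x2`), and the assumption that `x1` lies in the Markov boundary while `x2` is redundant given `x1` are all illustrative; this is not the paper's proposed method, only the standard Shapley framework it builds on.

```python
from itertools import combinations
from math import factorial

def shapley_values(features, v):
    """Exact Shapley values for a characteristic function v over frozensets of features."""
    n = len(features)
    phi = {f: 0.0 for f in features}
    for f in features:
        rest = [g for g in features if g != f]
        for k in range(n):
            for S in combinations(rest, k):
                # Shapley weight for a coalition of size k
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[f] += w * (v(frozenset(S) | {f}) - v(frozenset(S)))
    return phi

# Toy characteristic function: v(S) = predictive value of subset S.
# "x1" is assumed to be in the Markov boundary; "x2" adds nothing once
# x1 is present. These numbers are invented for illustration.
V = {
    frozenset(): 0.0,
    frozenset({"x1"}): 0.8,
    frozenset({"x2"}): 0.5,
    frozenset({"x1", "x2"}): 0.8,
}
phi = shapley_values(["x1", "x2"], V.__getitem__)
```

With these toy values the Markov-boundary feature receives the larger attribution, since the redundant feature's marginal contribution vanishes whenever `x1` is already in the coalition.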
Submission Number: 29