Gradient-based Counterfactual Explanations using Tractable Probabilistic Models


Sep 29, 2021 (edited Oct 05, 2021) · ICLR 2022 Conference Blind Submission
  • Keywords: counterfactual example, sum-product networks, tractable probabilistic models, counterfactual explanation
  • Abstract: Counterfactual examples are an appealing class of post-hoc explanations for machine learning models. Given an input x of class y, its counterfactual is a contrastive example x' of another class y'. Current approaches primarily solve this task via a complex optimization: they define an objective function based on the loss of the counterfactual outcome y', with hard or soft constraints, and then optimize this function as a black box. This "deep learning" approach, however, is rather slow, sometimes tricky, and may produce unrealistic counterfactual examples. In this work, we propose a novel approach that addresses these problems using only two gradient computations based on tractable probabilistic models. First, we compute an unconstrained counterfactual u of x that induces the counterfactual outcome y'. Then, we adapt u toward higher-density regions, yielding x'. Empirical evidence demonstrates the clear advantages of our approach.
  • One-sentence Summary: Generating counterfactual examples using tractable probabilistic models.
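The two-step procedure described in the abstract can be sketched in code. The sketch below is a toy illustration, not the paper's implementation: it stands in a linear (logistic) classifier for the black-box model and a single Gaussian density for the tractable probabilistic model (the paper uses sum-product networks), so both gradient computations are closed-form. Step 1 takes one gradient-based step across the decision boundary to obtain the unconstrained counterfactual u; step 2 takes one gradient-ascent step on the log-density to move u toward a higher-density region, yielding x'. All names (`classifier_logit`, `counterfactual`, `margin`, `density_step`) are hypothetical.

```python
import numpy as np

def classifier_logit(x, w, b):
    # Toy stand-in for the model being explained: a linear logit.
    return x @ w + b

def counterfactual(x, w, b, mu, sigma2, margin=1.0, density_step=0.5):
    """Two gradient computations, mirroring the abstract's description.

    Step 1: move x along the classifier's gradient (here just w) far
    enough that the logit crosses to -margin, inducing class y'.
    Step 2: one gradient-ascent step on the Gaussian log-density,
    grad log N(u; mu, sigma2) = (mu - u) / sigma2, pulling u toward
    a higher-density region to make the counterfactual more realistic.
    """
    logit = classifier_logit(x, w, b)
    u = x - (logit + margin) * w / (w @ w)      # unconstrained counterfactual
    x_cf = u + density_step * (mu - u) / sigma2  # density-adapted counterfactual
    return u, x_cf

if __name__ == "__main__":
    w, b = np.array([1.0, 0.0]), 0.0
    mu, sigma2 = np.array([-2.0, 0.0]), 1.0   # density concentrated on class y'
    x = np.array([1.0, 0.5])                  # classified as y (positive logit)
    u, x_cf = counterfactual(x, w, b, mu, sigma2)
    print(classifier_logit(u, w, b), classifier_logit(x_cf, w, b))
```

Note that in general the density step is not guaranteed to preserve the class flip; in the paper's setting this is handled by the method itself, whereas this sketch simply keeps the step small.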