Cost-aware counterfactuals for black box explanations

Published: 27 Oct 2023, Last Modified: 23 Nov 2023, NeurIPS XAIA 2023
Abstract: Counterfactual explanations provide actionable insights into the minimal change in a system that would lead to a more desirable prediction from a black box model. We address the challenge of finding valid, low-cost counterfactuals in the setting where each feature has a different cost or preference for being perturbed. We propose a multiplicative weight approach that is applied to the perturbation, and show that this simple approach can be easily adapted to obtain multiple diverse counterfactuals, as well as to integrate the feature importances obtained from other state-of-the-art explainers to provide counterfactual examples. Additionally, we discuss the computation of valid counterfactuals with numerical gradient-based methods when the black box model presents flat regions with no reliable gradient. In this scenario, sampling approaches, as well as those that rely on available data, can provide counterfactuals that are not close to the decision boundary. We show that a simple long-range guidance approach, which consists of sampling from a larger-radius sphere in search of a direction of change for the black box predictor when no gradient is available, improves the quality of the counterfactual explanation. In this work we discuss existing approaches, and show how our proposed alternatives compare favourably on different datasets and metrics.
Submission Track: Full Paper Track
Application Domain: Social Science
Survey Question 1: We tackle the problem of obtaining diverse, high-quality counterfactual explanations for a model with high prediction performance (e.g., credit lending), to improve the recourse choices available to a user.
Survey Question 2: Complex models achieve high prediction performance when modeling a particular system, but they lack mechanisms to provide recourse for decision making. Counterfactuals are a natural way of providing recourse to a user.
Survey Question 3: LIME, SHAP, Counterfactuals, Saliency Explainers.
Submission Number: 79
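
Illustrative sketch: the abstract describes two ideas, a multiplicative weight applied to the perturbation and a long-range guidance step that samples from a larger-radius sphere when no gradient is available. The Python sketch below shows one way these could fit together. It is not the authors' implementation. The function names, the normalised step, the radius-doubling fallback, and the convention that a smaller weight marks a more expensive feature are all assumptions made for illustration.

```python
import numpy as np


def numerical_gradient(f, x, eps=1e-3):
    """Central finite-difference estimate of the gradient of f at x."""
    grad = np.zeros_like(x, dtype=float)
    for i in range(x.size):
        step = np.zeros_like(x, dtype=float)
        step[i] = eps
        grad[i] = (f(x + step) - f(x - step)) / (2 * eps)
    return grad


def long_range_direction(f, x, radius, n_samples, rng):
    """Sample points on a sphere of the given radius around x.

    Returns the unit direction with the largest increase in f,
    or None if no sampled point increases f.
    """
    dirs = rng.normal(size=(n_samples, x.size))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    base = f(x)
    gains = np.array([f(x + radius * d) - base for d in dirs])
    best = int(np.argmax(gains))
    return dirs[best] if gains[best] > 0 else None


def cost_aware_counterfactual(f, x0, weights, target=0.5, lr=0.1,
                              max_iter=500, flat_tol=1e-8,
                              radius=1.0, n_samples=64, seed=0):
    """Search for x with f(x) >= target, starting from x0.

    weights multiplies the perturbation feature by feature
    (assumption: a smaller weight means a costlier feature).
    Returns the counterfactual, or None if none is found.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    weights = np.asarray(weights, dtype=float)
    for _ in range(max_iter):
        if f(x) >= target:
            return x
        direction = numerical_gradient(f, x)
        if np.linalg.norm(direction) < flat_tol:
            # Flat region: fall back to long-range guidance.
            direction = long_range_direction(f, x, radius, n_samples, rng)
            if direction is None:
                radius *= 2  # assumption: widen the search and retry
                continue
        step = weights * direction  # multiplicative weight on the perturbation
        norm = np.linalg.norm(step)
        if norm == 0:
            break
        x = x + lr * step / norm
    return None
```

Here `f` stands for any black box that maps a 1-D feature vector to a score, for example the predicted probability of the desired class. With `weights=[1.0, 0.1]`, the search is steered to change the first feature far more than the second, which is the kind of per-feature preference the abstract refers to.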