Consistent Counterfactual Explanations via Anomaly Control and Data Coherence

Published: 01 Jan 2025, Last Modified: 20 May 2025, IEEE Trans. Artif. Intell. 2025, CC BY-SA 4.0
Abstract: Algorithmic recourse methods are a popular way to give individuals affected by machine learning models recommendations on feasible actions that lead to a more favorable prediction. Most previous algorithmic recourse methods assume that the predictive model does not change over time. In reality, however, deployed models may be periodically retrained and may also have their architecture changed. It is therefore desirable that a recourse remain valid when such a model update occurs, unless new evidence arises. We call this property consistency. This article presents anomaly control and data coherence (ACDC), a novel model-agnostic recourse method that generates counterfactual explanations, i.e., instance-level recourses. ACDC is inspired by anomaly detection methods and uses a one-class classifier to aid the search for valid, consistent, and feasible counterfactual explanations. The one-class classifier ensures that the generated counterfactual explanations lie on the data manifold and are not outliers of the target class. We compare ACDC against several state-of-the-art recourse methods across four datasets. Our experiments show that ACDC outperforms the baselines in generating consistent counterfactual explanations, as well as feasible and plausible ones, while keeping proximity measures similar to those of the baseline methods that target the data manifold.
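The general idea of using a one-class classifier as a plausibility filter during counterfactual search can be illustrated with a minimal sketch. This is not the ACDC algorithm, whose details the abstract does not give. Everything below is an assumption made for illustration: the toy k-nearest-neighbor one-class classifier (`KnnOneClass`), the two linear models standing in for a model before and after retraining (`model_a`, `model_b`), the synthetic target-class data, and the straight-line search (`search`).

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class KnnOneClass:
    """Toy one-class classifier (hypothetical, not from the paper): a point is
    an inlier if the distance to its k-th nearest training point does not
    exceed a quantile of the same statistic over the training set."""
    def __init__(self, points, k=3, quantile=0.9):
        self.points, self.k = points, k
        scores = sorted(self._score(p, is_training_point=True) for p in points)
        self.threshold = scores[int(quantile * (len(scores) - 1))]

    def _score(self, x, is_training_point=False):
        d = sorted(dist(x, p) for p in self.points)
        # For a training point, d[0] is its zero distance to itself, so skip it.
        return d[self.k] if is_training_point else d[self.k - 1]

    def is_inlier(self, x):
        return self._score(x) <= self.threshold

def linear_model(w, b):
    """Binary classifier: 1 (favorable) if w.x + b > 0, else 0."""
    return lambda x: int(sum(wi * xi for wi, xi in zip(w, x)) + b > 0)

def search(x0, targets, model, plausible=None, steps=50):
    """Return the point closest to x0, on the segments x0 -> target, that the
    model labels 1 and (if given) that the plausibility check accepts."""
    best = None
    for t in targets:
        for i in range(1, steps + 1):
            a = i / steps
            c = tuple(x + a * (ti - x) for x, ti in zip(x0, t))
            if model(c) != 1 or (plausible and not plausible(c)):
                continue
            if best is None or dist(x0, c) < dist(x0, best):
                best = c
            break  # farther points on this segment cannot be closer to x0
    return best

random.seed(0)
# Synthetic favorable-class samples (assumed data, for illustration only).
targets = [(random.uniform(1.5, 2.5), random.uniform(1.5, 2.5)) for _ in range(40)]
occ = KnnOneClass(targets)

x0 = (-1.0, -1.0)                          # instance with an unfavorable prediction
model_a = linear_model((1.0, 1.0), -2.0)   # deployed model
model_b = linear_model((1.0, 1.0), -2.2)   # slightly shifted "retrained" model

cf_plain = search(x0, targets, model_a)                        # validity only
cf_occ = search(x0, targets, model_a, plausible=occ.is_inlier)  # validity + plausibility

for name, cf in (("plain", cf_plain), ("one-class filtered", cf_occ)):
    print(name, cf, "valid after update:", bool(model_b(cf)),
          "distance:", round(dist(x0, cf), 3))
```

In this toy setting the unfiltered counterfactual stops just past the decision boundary of `model_a`, so a small boundary shift invalidates it. The filtered counterfactual is pushed into the region covered by target-class data and survives the update, at the cost of a larger distance from the original instance.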