CUBE: Causal Intervention-Based Counterfactual Explanation for Prediction Models

Published: 01 Jan 2024, Last Modified: 24 Feb 2025 · IEEE Trans. Knowl. Data Eng. 2024 · CC BY-SA 4.0
Abstract: Recent years have witnessed a rapid explosion of artificial intelligence applied across domains, often with performance surpassing human levels. Despite this success, the underlying mechanisms of these models remain a mystery: their complicated representations make human understanding difficult, and this opacity can lead to discrimination and non-robust predictions. Making deep learning models more transparent and understandable is gaining popularity, but most interpretation approaches capture spurious correlations, leading to suboptimal, incorrect, or even biased interpretations; causal explanations can reduce these effects. Motivated by this, we study the generation of causal explanations and propose CUBE, a causal intervention-based counterfactual interpretation method. To ensure that the generation process of counterfactual explanations conforms to causality, we model it as a causal graph and construct a counterfactual generation model based on causal intervention; to generate counterfactuals that adhere to causality, we introduce a causal director that captures the causal relationships in the data distribution and guides the generation of counterfactuals; and to improve efficiency when facing a large number of explanation queries, we cast counterfactual generation as a sample generation problem and propose an explainable framework based on adversarial generation. Experimental results validate that CUBE outperforms other approaches in terms of both lower time costs and higher explanation quality.
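To make the counterfactual-explanation setting concrete, the sketch below shows a generic, non-causal baseline: a Wachter-style gradient search that perturbs an input just enough to flip a classifier's prediction while staying close to the original. This is *not* CUBE's method (CUBE instead uses a causal graph, a causal director, and an amortized adversarial generator); every name here (`w`, `b`, `find_counterfactual`, the toy linear model) is an illustrative assumption, not from the paper.

```python
import numpy as np

# Toy linear classifier: p(y=1|x) = sigmoid(w.x + b).
# Weights are arbitrary, chosen only for illustration.
w = np.array([1.5, -2.0])
b = 0.5

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(x):
    return sigmoid(x @ w + b)

def find_counterfactual(x, target=1.0, lam=0.1, lr=0.2, steps=500):
    """Gradient search for a point x' near x whose prediction moves toward `target`.

    Minimizes (f(x') - target)^2 + lam * ||x' - x||^2, the classic
    per-query counterfactual objective. CUBE's adversarial generator
    amortizes this per-query optimization across many explanation queries.
    """
    x_cf = x.copy()
    for _ in range(steps):
        p = predict_proba(x_cf)
        # Gradient of (p - target)^2 w.r.t. x_cf: 2(p - target) * p(1 - p) * w,
        # plus the gradient of the proximity penalty.
        grad = 2 * (p - target) * p * (1 - p) * w + 2 * lam * (x_cf - x)
        x_cf -= lr * grad
    return x_cf

x = np.array([-1.0, 1.0])       # originally classified as class 0
x_cf = find_counterfactual(x)   # minimally perturbed input classified as class 1
```

The difference `x_cf - x` is the explanation: the smallest feature change (under this objective) that alters the model's decision. Because this search ignores how features causally depend on each other, it can suggest unrealistic changes, which is the gap causal approaches such as CUBE aim to close.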