Unified Counterfactual Explanation Framework for Black-Box Models

Published: 01 Jan 2023, Last Modified: 07 Apr 2025. Venue: PRICAI (3) 2023. License: CC BY-SA 4.0
Abstract: Despite their large-scale deployment in industry and everyday life, connectionism-based deep neural networks are still criticized for their black-box nature. Counterfactual explanation can shed light on the inner mechanism of an arbitrary deep-learning model, which makes it a preferable local interpretation method. Many methods for counterfactual generation exist, but they share two defects: (1) Disunity. There is no agreement on the model architecture or the optimization method used for counterfactual generation. (2) Neglect of desiderata. A good counterfactual sample should satisfy several desiderata, but most existing works cover only a few of them. To address these problems, we propose UNICE, a unified framework for counterfactual generation. UNICE models counterfactual generation as a multi-task optimization problem on a dense data manifold learned by an auto-encoder. In addition, UNICE covers, to the best of our knowledge, the full set of counterfactual desiderata. Moreover, UNICE components can be customized for specific tasks and data modalities. A UNICE implementation for tabular data is provided; it surpasses state-of-the-art methods on five of six metrics, indicating the effectiveness of the proposed method.
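To make the idea of optimizing in an auto-encoder's latent space more concrete, the following is a minimal sketch and not the authors' UNICE implementation. Everything in it is an assumption chosen for illustration: the `black_box` classifier, the linear `encode`/`decode` pair standing in for a trained auto-encoder, the two objective terms (validity and proximity), their weights, and the random-search optimizer, which is swapped in here only because it needs no gradients from the black-box model.

```python
import numpy as np

# Hypothetical black-box classifier: returns P(class = 1) for a 3-feature input.
def black_box(x):
    w = np.array([1.5, -2.0, 0.5])
    return 1.0 / (1.0 + np.exp(-(x @ w - 0.2)))

# Hypothetical stand-in for a trained auto-encoder: a linear map onto a
# 2-D subspace with orthonormal basis vectors (the "data manifold").
_basis, _ = np.linalg.qr(np.random.default_rng(0).normal(size=(3, 2)))

def encode(x):
    return x @ _basis

def decode(z):
    return z @ _basis.T

def objective(z, x0, target, weights):
    """Weighted sum of two illustrative desiderata, evaluated on decode(z)."""
    x_cf = decode(z)
    validity = (black_box(x_cf) - target) ** 2   # reach the target prediction
    proximity = np.sum((x_cf - x0) ** 2)         # stay close to the original
    return weights[0] * validity + weights[1] * proximity

def find_counterfactual(x0, target=1.0, weights=(10.0, 0.1),
                        steps=2000, scale=0.1, seed=0):
    """Gradient-free random search in latent space (an assumed optimizer)."""
    rng = np.random.default_rng(seed)
    z = encode(x0)
    best = objective(z, x0, target, weights)
    for _ in range(steps):
        candidate = z + rng.normal(scale=scale, size=z.shape)
        value = objective(candidate, x0, target, weights)
        if value < best:  # keep only improving moves
            z, best = candidate, value
    return decode(z)

x0 = np.array([-1.0, 1.0, 0.0])
x_cf = find_counterfactual(x0)
print("original prediction:      ", black_box(x0))
print("counterfactual prediction:", black_box(x_cf))
```

Because every candidate is produced by `decode`, the search never leaves the learned manifold. Further desiderata could be added as extra weighted terms in `objective`, which is one plausible reading of the multi-task formulation described in the abstract.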