Unifying Perspectives: Plausible Counterfactual Explanations on Global, Group-wise, and Local Levels

ICLR 2026 Conference Submission19089 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: Explainable AI, Counterfactual Explanations
TL;DR: This paper introduces a unified method for generating local, global, and group-wise counterfactual explanations for differentiable classification models, using gradient-based optimization and a probabilistic plausibility criterion.
Abstract: The growing complexity of AI systems has intensified the need for transparency through Explainable AI (XAI). Counterfactual explanations (CFs) offer actionable "what-if" scenarios on three levels: Local CFs providing instance-specific insights, Global CFs addressing broader trends, and Group-wise CFs (GWCFs) striking a balance and revealing patterns within cohesive groups. Despite the availability of methods for each granularity level, the field lacks a unified method that integrates these complementary approaches. We address this limitation by proposing a gradient-based optimization method for differentiable models that generates Local, Global, and Group-wise Counterfactual Explanations in a unified manner. In particular, we enhance GWCF generation by combining instance grouping and counterfactual generation into a single efficient process, replacing traditional two-step methods. Moreover, to ensure trustworthiness, we pioneer the integration of plausibility criteria into the GWCF domain, making explanations both valid and realistic. Our results demonstrate the method's effectiveness in balancing validity, proximity, and plausibility while optimizing group granularity.
Supplementary Material: zip
Primary Area: interpretability and explainable AI
Submission Number: 19089