Visual Object-Centric Counterfactual Explanations

18 Sept 2025 (modified: 23 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Interpretability, Explainability
TL;DR: We introduce an object-centric diffusion method that generates visual counterfactual explanations by editing object-level representations with gradient guidance and a GMM regularizer, outperforming prior approaches and generalizing across datasets.
Abstract: Generating visually coherent and realistic counterfactual explanations is essential for understanding discriminative visual models. Existing methods often modify images at the pixel level or within holistic latent spaces, leading to entangled changes that obscure the precise factors influencing model decisions. To address this, we introduce a novel object-centric method for visual counterfactual explanations. Our approach decomposes input images into distinct object-centric latent slots and leverages the model's gradients to guide a reverse diffusion process conditioned on these slots. To maintain realism, we propose a Gaussian Mixture Model (GMM)-based regularizer that constrains counterfactuals to remain within the distribution of plausible object states, preventing unrealistic generations. Experiments on three datasets and a user study demonstrate that our object-centric approach yields significantly more interpretable and realistic counterfactuals than state-of-the-art baselines. Moreover, our approach shows strong generalization: when trained solely on the FFHQ dataset, it successfully generates coherent counterfactual explanations on unseen CelebA-HQ data. Overall, our approach substantially advances visual counterfactual explanations by offering explicit object-level interpretability and improved generation quality.
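To make the mechanism described in the abstract concrete, below is a minimal sketch of the gradient-guided, GMM-regularized edit of object-centric slot latents. Everything here is a hypothetical stand-in rather than the authors' implementation: the names (`classifier_head`, `gmm_log_prob`, `counterfactual_slots`), dimensions, and hyperparameters are invented, the GMM parameters are random instead of being fit on encoded training slots, the classifier scores slots directly (in the paper the gradient would flow through the slot-conditioned diffusion decoder), and the reverse diffusion decoding step is omitted entirely.

```python
import math
import torch
import torch.nn.functional as F

torch.manual_seed(0)
NUM_SLOTS, SLOT_DIM, NUM_CLASSES, GMM_COMPONENTS = 4, 16, 2, 8

# Hypothetical stand-in for the discriminative model being explained;
# it scores the flattened slots directly and is untrained.
classifier_head = torch.nn.Linear(NUM_SLOTS * SLOT_DIM, NUM_CLASSES)

# GMM over plausible slot states. In practice these parameters would be
# fit on slots encoded from training images; random values are used here
# only so the sketch runs.
gmm_means = torch.randn(GMM_COMPONENTS, SLOT_DIM)
gmm_log_weights = torch.log_softmax(torch.zeros(GMM_COMPONENTS), dim=0)
gmm_log_std = torch.zeros(GMM_COMPONENTS, SLOT_DIM)  # diagonal covariance

def gmm_log_prob(slots):
    """Total log-density of the slots under the diagonal-covariance GMM."""
    diff = slots.unsqueeze(1) - gmm_means.unsqueeze(0)          # (S, K, D)
    log_comp = -0.5 * (((diff / gmm_log_std.exp()) ** 2
                        + 2 * gmm_log_std
                        + math.log(2 * math.pi)).sum(-1))        # (S, K)
    return torch.logsumexp(gmm_log_weights + log_comp, dim=-1).sum()

def counterfactual_slots(slots, target_class, steps=200, lr=0.05, lam=0.1):
    """Edit slot latents toward the target class using classifier gradients,
    penalized to stay in high-density regions of the slot GMM."""
    z = slots.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    target = torch.tensor([target_class])
    for _ in range(steps):
        logits = classifier_head(z.reshape(1, -1))
        # Cross-entropy pulls the prediction to the counterfactual class;
        # the GMM log-density term keeps the edited slots plausible.
        loss = F.cross_entropy(logits, target) - lam * gmm_log_prob(z)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return z.detach()

original_slots = torch.randn(NUM_SLOTS, SLOT_DIM)
edited_slots = counterfactual_slots(original_slots, target_class=1)
print("per-slot edit magnitude:", (edited_slots - original_slots).norm(dim=-1))
```

Operating on slots rather than pixels makes each change attributable to an individual object; in the full method the edited slots would then condition the reverse diffusion process to render the counterfactual image.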
Supplementary Material: zip
Primary Area: interpretability and explainable AI
Submission Number: 10895