CoLa-DCE – Concept-guided Latent Diffusion Counterfactual Explanations

Franz Motzkus; Christian Hellert; Ute Schmid

27 Sept 2024 (modified: 05 Feb 2025), Submitted to ICLR 2025, CC BY 4.0

Keywords: Counterfactual Explanations, Concept-based Explanations, Diffusion-based Counterfactuals, Counterfactual Image Generation

TL;DR: We introduce CoLa-DCE, which generates image counterfactual explanations with concept guidance for better comprehensibility, fewer feature changes, and concept-based control.

Abstract: Recent advancements in generative AI have introduced novel prospects and practical implementations. Especially diffusion models show their strength in generating diverse and, at the same time, realistic features, positioning them well for generating counterfactual explanations for computer vision models. Answering “what if” questions of what needs to change to make an image classifier change its prediction, counterfactual explanations align well with human understanding and consequently help in making model behavior more comprehensible. Current methods succeed in generating authentic counterfactuals, but lack transparency as feature changes are not directly perceivable. To address this limitation, we introduce Concept-guided Latent Diffusion Counterfactual Explanations (CoLa-DCE). CoLa-DCE generates concept-guided counterfactuals for any classifier with a high degree of control regarding concept selection and spatial conditioning. The counterfactuals comprise an increased granularity through minimal feature changes. The reference feature visualization ensures better comprehensibility, while the feature localization provides increased transparency of “where” changed “what”. We demonstrate the advantages of our approach in minimality and comprehensibility across multiple image classification models and datasets and provide insights into how our CoLa-DCE explanations help comprehend model errors like misclassification cases.

Supplementary Material: zip

Primary Area: interpretability and explainable AI

Submission Number: 10478
