Counterfactual Explanations on Robust Perceptual Geodesics

ICLR 2026 Conference Submission 16075 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Interpretability, Visual Counterfactual Explanations, Explainability
TL;DR: We propose Perceptual Counterfactual Geodesics (PCG), a method that generates counterfactuals by tracing geodesics in a perceptually aligned latent space, outperforming prior methods and avoiding failures from misaligned geometry.
Abstract: Latent-space optimization methods for counterfactual explanations—framed as minimal semantic perturbations that change model predictions—inherit the ambiguity of Wachter et al.’s objective: the choice of distance metric dictates whether perturbations are meaningful or adversarial. Existing approaches adopt flat or misaligned geometries, leading to off-manifold artifacts, semantic drift, or adversarial collapse. We introduce Perceptual Counterfactual Geodesics (PCG), a method that constructs counterfactuals by tracing geodesics under a perceptual Riemannian metric induced from robust vision features. This geometry aligns with human perception and penalizes brittle directions, enabling smooth, on-manifold, semantically valid transitions. Experiments on three vision datasets show that PCG outperforms baselines and reveals failure modes hidden under standard metrics.
Primary Area: interpretability and explainable AI
Submission Number: 16075