Counterfactual generation for Out-of-Distribution data

Published: 05 Nov 2025, Last Modified: 05 Nov 2025 · NLDL 2026 Spotlight · License: CC BY 4.0
Keywords: Counterfactual Explanations, Out-of-distribution, Explainability
TL;DR: A two-stage approach to explain why a model considers a data point out-of-distribution.
Abstract: Deploying machine learning models in safety-critical applications necessitates both reliable out-of-distribution (OOD) detection and interpretable model behavior. While substantial progress has been made in OOD detection and explainable AI (XAI), the question of why a model classifies a data point as OOD remains underexplored. Counterfactual explanations are a widely used XAI approach, yet they often fail in OOD contexts, as the generated examples may themselves be OOD. To address this limitation, we introduce the concept of OOD counterfactuals—perturbed inputs that transition between distinct OOD categories—to provide insight into the model’s OOD classification decisions. We propose a novel method for generating OOD counterfactuals and evaluate it on synthetic, tabular, and image datasets. Empirical results demonstrate that our approach offers both quantitatively and qualitatively improved explanations compared to existing baselines.
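The abstract describes generating perturbed inputs that transition between distinct OOD categories while staying close to the original point. The paper's own two-stage method is not detailed here, but the general flavor of proximity-constrained counterfactual search can be sketched with a standard Wachter-style gradient procedure on a toy logistic "OOD-category" classifier. All names (`ood_counterfactual`, the weights `w`, `b`, the penalty `lam`) are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np

def ood_counterfactual(x, w, b, target=1.0, lam=0.1, lr=0.5, steps=200):
    """Wachter-style counterfactual sketch (NOT the paper's method):
    perturb x so that a logistic classifier sigmoid(w.x + b), standing in
    for an OOD-category predictor, assigns x to `target`, while an L2
    penalty lam keeps the counterfactual close to the original input."""
    x_cf = x.astype(float).copy()
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(w @ x_cf + b)))      # current P(category 1)
        # gradient of cross-entropy toward `target` + proximity penalty
        grad = (p - target) * w + lam * (x_cf - x)
        x_cf -= lr * grad
    return x_cf

# Toy usage: x starts in category 0 (p < 0.5); the search nudges it
# across the decision boundary into category 1 while staying nearby.
x = np.array([0.0, 0.0])
w = np.array([1.0, 1.0])
b = -2.0
x_cf = ood_counterfactual(x, w, b)
p_cf = 1.0 / (1.0 + np.exp(-(w @ x_cf + b)))
print(x_cf, p_cf)  # counterfactual lies past the boundary, p_cf > 0.5
```

The proximity term is what the abstract flags as the hard part in OOD settings: an unconstrained search can land the counterfactual in yet another out-of-distribution region, which is the failure mode the proposed OOD counterfactuals are designed to avoid.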
Serve As Reviewer: ~Nawid_Keshtmand1
Submission Number: 45