Axiomatic Foundations of Counterfactual Explanations

Published: 19 Dec 2025, Last Modified: 05 Jan 2026
Venue: AAMAS 2026 Full
License: CC BY 4.0
Keywords: Explainable AI, Explainable Agency, Foundations of Explainability, Counterfactual Explanations
Abstract: Explaining autonomous and intelligent systems is critical to building trust in their decisions. Counterfactuals have emerged as one of the most compelling forms of explanation: they address “why not” questions by revealing how decisions could be altered. Despite the growing literature, most existing explainers focus on a single type of counterfactual and are restricted to local explanations of individual instances. There has been no systematic study of alternative counterfactual types, nor of global counterfactuals that shed light on a system’s overall reasoning process. This paper addresses both gaps by introducing an axiomatic framework built on a set of desirable properties for counterfactual explainers. It proves impossibility theorems showing that no single explainer can satisfy certain axiom combinations simultaneously, and fully characterizes all compatible sets. Representation theorems then establish five one-to-one correspondences between specific subsets of axioms and the families of explainers that satisfy them. Each family gives rise to a distinct type of counterfactual explanation, uncovering five fundamentally different types of counterfactuals: some correspond to local explanations, while others capture global explanations. Finally, the framework situates existing explainers within this taxonomy, formally characterizes their behavior, and analyzes the computational complexity of generating such explanations.
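To make the notion of a local counterfactual concrete, here is a minimal sketch. It assumes a toy loan-approval classifier (not from the paper) and answers the “why not” question by brute-force search for the smallest feature change that flips a rejection into an approval; the function names and the L1 change cost are illustrative assumptions only.

```python
from itertools import product

# Hypothetical toy classifier, for illustration only:
# approve iff income >= 50 and debt <= 20.
def classify(income, debt):
    return income >= 50 and debt <= 20

def counterfactual(income, debt, steps=range(-30, 31, 5)):
    """Brute-force search for the cheapest feature change that flips
    a rejection into an approval (a local counterfactual)."""
    if classify(income, debt):
        return None  # already approved: no "why not" to answer
    best = None
    for di, dd in product(steps, steps):
        if classify(income + di, debt + dd):
            cost = abs(di) + abs(dd)  # L1 distance as change cost
            if best is None or cost < best[0]:
                best = (cost, di, dd)
    return best

# "Why was the loan denied?" -> cheapest flip: raise income by 10.
print(counterfactual(40, 15))
```

A global counterfactual, by contrast, would summarize such flips over the whole input space rather than for one instance.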
Area: Representation and Reasoning (RR)
Generative AI: I acknowledge that I have read and will follow this policy.
Submission Number: 176