Unifying Explainable Agency Via a Ladder of Intentions
Abstract: Explainable Agency (XAg) is a subfield of Explainable AI (XAI) dedicated to explaining agents that perceive, reflect, reason, and act on an environment. However, the decision-making process that brings about action varies widely across architectures, and so do the resulting ways of producing explanations. This heterogeneity becomes a problem in the absence of standards for evaluating explanations, both in terms of whether humans can understand them and whether they are trustworthy: the field of agent explainability is riddled with competing definitions of what constitutes a good metric for evaluating the explanations of an agent of a particular architecture or paradigm. Furthermore, when those metrics require user studies, much of the work must be redone for each new agent architecture, cohort of explainees, or context. In this work, we propose a unifying way of examining agent behaviour, stratified into abstraction levels (the rungs of a ladder), that enables comparison of components across agent architectures. This architecture serves as an engineering pattern, decoupling the problem of providing trustworthy abstractions for explanations from the problem of formulating those explanations in an understandable manner. This paper is a first step toward a setting in which architectures provide abstractions that have been tested and validated through user studies, and evaluate their trustworthiness against them, while enabling explanation providers to innovate on methods for formulating explanations from abstractions without expert knowledge of every agent architecture.