Explainable Representation of Finite-Memory Policies for POMDPs using Decision Trees
Keywords: Partially Observable Markov Decision Processes, Finite-State Controller, Decision Tree
TL;DR: We provide an explainable representation of finite-memory policies for POMDPs via amalgamation of finite-state controllers with decision trees, yielding more compact, interpretable policies while preserving optimality.
Abstract: Partially Observable Markov Decision Processes (POMDPs) are a fundamental framework for decision-making under uncertainty, but optimal policies often require infinite memory, making implementation infeasible and rendering many problems undecidable. While finite-memory policies provide a practical alternative, they remain complex and challenging to interpret.
To address this, we propose a novel \emph{representation} of finite-memory policies that is both (i) interpretable and (ii) smaller, enhancing explainability without sacrificing optimality. To that end, we combine Mealy machines and decision trees (DTs): the latter describe simple, stationary parts of the policies, and the former describes how to switch among them.
We design a translation of finite-state-controller (FSC) policies from the standard literature into our new representation, enhancing explainability and compactness while preserving optimality.
Notably, our method seamlessly generalizes to other variants of finite-memory policies.
Additionally, we identify unique properties of ``attractor-based'' policies, enabling the construction of even smaller, simpler representations. Finally, through multiple case studies, we illustrate the improved explainability and practicality of our approach.
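To make the combination described in the abstract concrete, the following is a hypothetical toy sketch (not the paper's actual construction): a finite-state controller with two memory nodes, where each node's stationary behavior is a small decision tree over observation features, and the tree's leaves also name the next memory node, playing the role of the Mealy machine's transitions. All names (`node_a_tree`, the `"obstacle"` feature, the actions) are illustrative assumptions.

```python
# Hypothetical sketch, not the paper's implementation: a two-node FSC whose
# per-node stationary behavior is a tiny decision tree over observations.
from typing import Callable, Dict, List, Tuple

# A "decision tree" is encoded as nested if/else: it maps an observation
# (a dict of features) to an (action, next_memory_node) pair.
DT = Callable[[Dict[str, int]], Tuple[str, str]]

def node_a_tree(obs: Dict[str, int]) -> Tuple[str, str]:
    # Memory node "A": move forward unless an obstacle is observed,
    # in which case act and switch to avoidance mode "B".
    if obs["obstacle"]:
        return ("turn", "B")
    return ("forward", "A")

def node_b_tree(obs: Dict[str, int]) -> Tuple[str, str]:
    # Memory node "B": keep turning; once the obstacle clears, turn one
    # extra step (this is where finite memory matters) and return to "A".
    if obs["obstacle"]:
        return ("turn", "B")
    return ("turn", "A")

# The Mealy-machine part: which decision tree is active in each memory node.
fsc: Dict[str, DT] = {"A": node_a_tree, "B": node_b_tree}

def run(observations: List[Dict[str, int]], start: str = "A") -> List[str]:
    """Execute the controller on a sequence of observations."""
    node, actions = start, []
    for obs in observations:
        action, node = fsc[node](obs)
        actions.append(action)
    return actions

print(run([{"obstacle": 0}, {"obstacle": 1}, {"obstacle": 0}, {"obstacle": 0}]))
# → ['forward', 'turn', 'turn', 'forward']
```

Note how the third action differs from the first on an identical observation: the memory node, not the observation alone, determines the behavior, which is exactly what a stationary (memoryless) decision tree could not express on its own.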
Area: Search, Optimization, Planning, and Scheduling (SOPS)
Generative AI: I acknowledge that I have read and will follow this policy.
Submission Number: 1612