Learning Compact Regular Decision Processes using Priors and Cascades

Published: 17 Jul 2025 · Last Modified: 06 Sept 2025 · EWRL 2025 Poster · CC BY 4.0
Keywords: Offline Reinforcement Learning, Regular Decision Process, Automata Learning
TL;DR: We study offline RL for Regular Decision Processes (RDPs), introduce a notion of priors for automaton learning, and develop a new algorithm for learning more compact RDPs.
Abstract: In this work we study offline Reinforcement Learning (RL), extending previous work on learning Regular Decision Processes (RDPs), a class of non-Markovian environments in which the unknown dependency of future observations and rewards on past interactions can be captured by a hidden finite-state automaton. We utilise the language metric introduced previously for an offline RL algorithm for RDPs, and introduce a novel algorithm that learns significantly more compact RDPs with cycles, which are crucial for scaling to larger, more complex environments. Key to our results is a novel notion of priors for automaton learning, which allows us to exploit prior domain-related knowledge by factoring out of the state space any feature that is known a priori. We validate our approach experimentally and provide a Probably Approximately Correct (PAC) analysis of our algorithm, showing that it enjoys a sample complexity polynomial in the relevant parameters.
Confirmation: I understand that authors of each paper submitted to EWRL may be asked to review 2-3 other submissions to EWRL.
Serve As Reviewer: ~Ahana_Deb1
Track: Regular Track: unpublished work
Submission Number: 158