Learning Compact Regular Decision Processes using Priors and Cascades

Published: 17 Jul 2025 · Last Modified: 06 Sept 2025 · EWRL 2025 Poster · CC BY 4.0
Keywords: Offline Reinforcement Learning, Regular Decision Process, Automata Learning
TL;DR: We study offline RL for Regular Decision Processes (RDPs), introduce a notion of priors for automaton learning, and develop a new algorithm for learning more compact RDPs.
Abstract: In this work we study offline Reinforcement Learning (RL), extending previous work on learning Regular Decision Processes (RDPs), a class of non-Markovian environments in which the unknown dependency of future observations and rewards on past interactions can be captured by a hidden finite-state automaton. We utilise the language metric introduced previously for an offline RL algorithm for RDPs, and introduce a novel algorithm that learns significantly more compact RDPs with cycles, which are crucial for scaling to larger, more complex environments. Key to our results is a novel notion of priors for automaton learning, which allows us to exploit prior domain-related knowledge by factoring out of the state space any feature that is known a priori. We validate our approach experimentally and provide a Probably Approximately Correct (PAC) analysis of our algorithm, showing that it enjoys a sample complexity polynomial in the relevant parameters.
Confirmation: I understand that authors of each paper submitted to EWRL may be asked to review 2-3 other submissions to EWRL.
Serve As Reviewer: ~Ahana_Deb1
Track: Regular Track: unpublished work
Submission Number: 158