Keywords: Neurosymbolic AI, Symbolic Reinforcement Learning, Game Theory, MCTS, Automata Learning
TL;DR: We build a verifiably correct symbolic reinforcement learning pipeline for agents in games that leverages automata learning and model checking.
Abstract: We introduce \textit{PALS} (Preference-guided Active automata Learning
for Symbolic reinforcement learning), an active automata learning
framework that learns fully-symbolic policies for goal-directed games
from a preference oracle and LTL safety specifications. \textit{PALS}
extends classical L$^*$ by allowing both the hypothesis and the
preference oracle to evolve as queries accumulate, with an MCTS-driven
audit stage that surfaces deviations preferred over the current
hypothesis and a shielding layer that patches the oracle whenever the
hypothesis violates the safety specification. We demonstrate the utility
of \textit{PALS} on the Taxi Driver game from the Gymnasium benchmark,
evaluate it against standard Q-learning and MCTS baselines on a suite
of game-theoretic benchmarks, and provide a proof sketch establishing
optimality under modest assumptions on the game structure. To the best
of our knowledge, \textit{PALS} is the first algorithm to learn
reinforcement-learning policies for agents in games fully symbolically
via automata learning.
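
As a rough illustration of one ingredient named above, checking whether a hypothesis violates a safety specification, the sketch below searches the product of a policy automaton and a safety monitor for a violating trace. It is not the \textit{PALS} implementation. It assumes the policy is a Mealy machine stored in plain dictionaries and that the LTL safety specification has already been compiled into a monitor automaton with designated bad states (a safety property is violated exactly when some finite prefix is bad). The function name `find_safety_violation`, the dictionary layout, and the toy wall-avoidance specification are all hypothetical.

```python
from collections import deque


def find_safety_violation(policy, monitor, observations):
    """Breadth-first search over the product of a policy and a safety monitor.

    policy:  {"init": state, "delta": {(state, obs): (next_state, action)}}
    monitor: {"init": state, "delta": {(state, (obs, action)): next_state},
              "bad": set of violating monitor states}
    Returns a shortest list of (obs, action) pairs that drives the monitor
    into a bad state, or None if no reachable product state is bad.
    """
    start = (policy["init"], monitor["init"])
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (p, m), trace = queue.popleft()
        if m in monitor["bad"]:
            return trace
        for obs in observations:
            if (p, obs) not in policy["delta"]:
                continue  # policy is undefined on this observation
            p_next, action = policy["delta"][(p, obs)]
            # Assumption: a missing monitor transition means "stay put".
            m_next = monitor["delta"].get((m, (obs, action)), m)
            nxt = (p_next, m_next)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, trace + [(obs, action)]))
    return None


# Toy specification (hypothetical): never move forward when a wall is ahead.
observations = ["clear", "wall_ahead"]
monitor = {
    "init": "ok",
    "delta": {("ok", ("wall_ahead", "forward")): "bad"},
    "bad": {"bad"},
}

# A one-state policy that always moves forward, and one that turns at walls.
unsafe_policy = {
    "init": "q0",
    "delta": {
        ("q0", "clear"): ("q0", "forward"),
        ("q0", "wall_ahead"): ("q0", "forward"),
    },
}
safe_policy = {
    "init": "q0",
    "delta": {
        ("q0", "clear"): ("q0", "forward"),
        ("q0", "wall_ahead"): ("q0", "turn"),
    },
}

violation = find_safety_violation(unsafe_policy, monitor, observations)
no_violation = find_safety_violation(safe_policy, monitor, observations)
```

In this toy setting, `violation` is a shortest counterexample trace and `no_violation` is `None`. A trace of this kind is the sort of evidence a shielding step could use to correct the oracle, although how \textit{PALS} actually patches its oracle is not specified here.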
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Paper Type: Standard paper
Submission Number: 64