Regular Decision Processes for Grid Worlds
Abstract: Markov decision processes are typically used for sequential
decision making under uncertainty. For many aspects, however, ranging
from constrained or safe specifications to various kinds of temporal (non-
Markovian) dependencies in task and reward structures, extensions are
needed. To that end, interest has grown in recent years in combining
reinforcement learning with temporal logic, that is, in combining
flexible behavior-learning methods with robust verification and guarantees.
In this paper we describe an experimental investigation of the
recently introduced regular decision processes, which support both non-
Markovian reward functions and non-Markovian transition functions. In particular,
we provide a toolchain for regular decision processes, algorithmic
extensions for online, incremental learning, an empirical evaluation
of model-free and model-based solution algorithms, and applications
in regular, but non-Markovian, grid worlds.
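To make the notion of a regular, non-Markovian reward concrete, the following is a minimal sketch (not the paper's toolchain; the grid size and the KEY/GOAL cells are hypothetical): the agent is rewarded at the goal only if it visited the key cell earlier in the episode, and this history dependence is tracked by a two-state finite automaton, the mechanism underlying regular decision processes.

```python
# Minimal sketch, assuming a 4x4 deterministic grid world with hypothetical
# KEY and GOAL cells. The reward is non-Markovian (it depends on the history)
# but regular: a two-state automaton ("key seen yet?") suffices to track it.
import random

GRID = 4                      # states are (x, y) pairs in a 4x4 grid
KEY, GOAL = (0, 3), (3, 3)    # hypothetical labeled cells for this example
ACTIONS = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}

def step(pos, action):
    """Deterministic move, clipped to the grid boundary."""
    dx, dy = ACTIONS[action]
    return (min(max(pos[0] + dx, 0), GRID - 1),
            min(max(pos[1] + dy, 0), GRID - 1))

def episode(policy, horizon=30):
    pos, has_key = (0, 0), False     # has_key is the automaton state
    for _ in range(horizon):
        pos = step(pos, policy(pos, has_key))
        has_key = has_key or pos == KEY        # automaton transition
        if pos == GOAL:
            return 1.0 if has_key else 0.0     # reward depends on history
    return 0.0

random_policy = lambda pos, has_key: random.choice(list(ACTIONS))
print(episode(random_policy))
```

Augmenting the grid position with the automaton state recovers a Markovian product model, which is the standard route by which regular decision processes are solved.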