The Steady-State Control Problem for Markov Decision Processes

S. Akshay, Nathalie Bertrand, Serge Haddad, Loïc Hélouët

Published: 2013, Last Modified: 30 Sept 2024, QEST 2013, CC BY-SA 4.0

Abstract: This paper addresses a control problem for probabilistic models in the setting of Markov decision processes (MDP). We are interested in the steady-state control problem which asks, given an ergodic MDP \(\mathcal{M}\) and a distribution \(\delta_{goal}\), whether there exists a (history-dependent randomized) policy \(\pi\) ensuring that the steady-state distribution of \(\mathcal{M}\) under \(\pi\) is exactly \(\delta_{goal}\). We first show that stationary randomized policies suffice to achieve a given steady-state distribution. Then we infer that the steady-state control problem is decidable for MDP, and can be represented as a linear program which is solvable in PTIME. This decidability result extends to labeled MDP (LMDP) where the objective is a steady-state distribution on labels carried by the states, and we provide a PSPACE algorithm. We also show that a related steady-state language inclusion problem is decidable in EXPTIME for LMDP. Finally, we prove that if we consider MDP under partial observation (POMDP), the steady-state control problem becomes undecidable.
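Since the abstract states that stationary randomized policies suffice, one way to make the problem concrete is to check a candidate stationary policy against a goal distribution. The sketch below is illustrative only and is not the paper's linear program: it verifies a given policy rather than searching for one. The two-state MDP `P`, the policy, and the function names `induced_chain`, `steady_state`, and `achieves` are all hypothetical, and the comparison uses a numerical tolerance where the problem statement asks for exact equality.

```python
# Hypothetical toy MDP: P[s][a][t] is the probability of moving
# from state s to state t when action a is played in s.
P = [
    {"a": [0.9, 0.1], "b": [0.5, 0.5]},
    {"a": [0.5, 0.5], "b": [0.1, 0.9]},
]


def induced_chain(P, policy):
    """Markov chain obtained by fixing a stationary randomized policy.

    policy[s] maps each action to the probability of choosing it in s.
    """
    n = len(P)
    M = [[0.0] * n for _ in range(n)]
    for s in range(n):
        for action, weight in policy[s].items():
            for t in range(n):
                M[s][t] += weight * P[s][action][t]
    return M


def steady_state(M, iters=10000, tol=1e-12):
    """Approximate the stationary distribution by power iteration.

    Assumes the chain is ergodic, so the iteration converges to the
    unique stationary distribution from any starting distribution.
    """
    n = len(M)
    dist = [1.0 / n] * n
    for _ in range(iters):
        nxt = [sum(dist[s] * M[s][t] for s in range(n)) for t in range(n)]
        if max(abs(x - y) for x, y in zip(dist, nxt)) < tol:
            return nxt
        dist = nxt
    return dist


def achieves(P, policy, goal, eps=1e-6):
    """True if the policy's steady state matches goal up to eps."""
    dist = steady_state(induced_chain(P, policy))
    return all(abs(x - y) <= eps for x, y in zip(dist, goal))


# State 0 always plays "a"; state 1 mixes "a" and "b" evenly.
policy = [{"a": 1.0}, {"a": 0.5, "b": 0.5}]
result = achieves(P, policy, [0.75, 0.25])
```

Under this policy the induced chain has rows (0.9, 0.1) and (0.3, 0.7), whose stationary distribution is (0.75, 0.25), so `result` is `True` for that goal. Deciding whether any such policy exists for a given goal is the part the paper handles with a linear program.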