Path-Constrained Markov Decision Processes: bridging the gap between probabilistic model-checking and decision-theoretic planning

Florent Teichteil-Königsbuch

Published: 2012 (ECAI 2012), Last Modified: 31 Oct 2024, License: CC BY-SA 4.0

Abstract: Markov Decision Processes (MDPs) are a popular model for planning under probabilistic uncertainty. The solution of an MDP is a policy represented as a controlled Markov chain, whose complex properties on execution paths can be automatically validated using stochastic model-checking techniques. In this paper, we propose a new theoretical model, named Path-Constrained Markov Decision Processes (PC-MDPs). It allows system designers to directly optimize safe policies in a single design pass: the possible executions of these policies are guaranteed to satisfy probabilistic constraints on their paths, expressed in Probabilistic Real-Time Computation Tree Logic. We mathematically analyze properties of PC-MDPs and provide an iterative linear programming algorithm for solving them. We also present experiments that illustrate PC-MDPs and highlight their benefits.
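
To make the notion of a path constraint concrete, the following minimal sketch checks a step-bounded reachability constraint on the Markov chain induced by a fixed policy. It is not the paper's iterative linear programming algorithm: it only validates a given policy after the fact, which is the separate step that PC-MDPs aim to fold into optimization. The three-state MDP, the action names, the 3-step horizon and the 0.1 bound are all hypothetical values chosen for illustration.

```python
from fractions import Fraction as F

# Hypothetical 3-state MDP: 0 = start, 1 = goal, 2 = unsafe (1 and 2 absorbing).
# TRANSITIONS[state][action] maps each successor state to its probability.
TRANSITIONS = {
    0: {
        "fast": {1: F(8, 10), 2: F(2, 10)},
        "slow": {0: F(45, 100), 1: F(50, 100), 2: F(5, 100)},
    },
    1: {"stay": {1: F(1)}},
    2: {"stay": {2: F(1)}},
}


def bounded_reach_probability(policy, targets, horizon):
    """For each start state, return the probability of reaching `targets`
    within `horizon` steps in the Markov chain induced by `policy`."""
    # With zero steps left, only target states count as reached.
    prob = {s: F(1) if s in targets else F(0) for s in TRANSITIONS}
    for _ in range(horizon):
        # One backward step: average the successors' values under the policy.
        prob = {
            s: F(1) if s in targets else sum(
                (p * prob[t] for t, p in TRANSITIONS[s][policy[s]].items()),
                F(0),
            )
            for s in TRANSITIONS
        }
    return prob


def satisfies(policy, start, targets, horizon, bound):
    """Check the constraint: P(reach targets within horizon steps) <= bound."""
    return bounded_reach_probability(policy, targets, horizon)[start] <= bound


# Two hypothetical policies that differ only in the action taken at state 0.
FAST_POLICY = {0: "fast", 1: "stay", 2: "stay"}
SLOW_POLICY = {0: "slow", 1: "stay", 2: "stay"}

if __name__ == "__main__":
    for name, policy in (("fast", FAST_POLICY), ("slow", SLOW_POLICY)):
        p = bounded_reach_probability(policy, {2}, 3)[0]
        print(name, float(p), satisfies(policy, 0, {2}, 3, F(1, 10)))
```

In this toy instance the "fast" policy violates the bound while the "slow" policy meets it, so a planner that ignores the constraint and a planner that respects it can select different policies. Exact fractions are used so the comparison against the bound is not affected by floating-point rounding.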