Strong polynomiality of policy iterations for average-cost MDPs modeling replacement and maintenance problems

Eugene A. Feinberg, Jefferson Huang

Published: 2013, Last Modified: 01 Oct 2024. Oper. Res. Lett. 2013. CC BY-SA 4.0

Abstract: This note considers an average-cost Markov decision process (MDP) with finite state and action sets that satisfies an additional condition: there is a state to which the system jumps with positive probability from any state and under any action. The main result is that the policy iteration algorithm is strongly polynomial for such MDPs, which are often used to model replacement and maintenance problems.
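
To make the setting concrete, the following is a minimal sketch of policy iteration for an average-cost MDP on a toy replacement problem. It is an illustration, not the authors' implementation: the three machine states, the costs, the transition probabilities, and the helper names (`solve`, `evaluate`, `policy_iteration`) are all hypothetical. The sketch uses Howard's rule, which switches every improvable state at once; the abstract does not say which variant the note analyzes. State 0 (a new machine) is reached with probability at least 0.2 from every state under every action, which is the condition in the abstract, so each policy has a single recurrent class and the evaluation equations with h(0) = 0 have a unique solution.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                for k in range(col, n + 1):
                    M[r][k] -= f * M[col][k]
    return [M[i][n] / M[i][i] for i in range(n)]


def evaluate(P, c, policy):
    """Solve g + h(s) = c(s) + sum_t P(t|s) h(t) with h(0) = 0 for a fixed policy."""
    n = len(policy)
    # Unknowns are x = [g, h(1), ..., h(n-1)]; column 0 holds the coefficient of g.
    A = [[1.0] + [(1.0 if j == s else 0.0) - P[policy[s]][s][j] for j in range(1, n)]
         for s in range(n)]
    b = [c[policy[s]][s] for s in range(n)]
    x = solve(A, b)
    return x[0], [0.0] + x[1:]


def policy_iteration(P, c, tol=1e-9):
    """Howard's policy iteration; P[a][s][t] are probabilities, c[a][s] one-step costs."""
    n_actions, n = len(c), len(c[0])
    policy = [0] * n
    while True:
        g, h = evaluate(P, c, policy)
        changed = False
        for s in range(n):
            q = [c[a][s] + sum(P[a][s][t] * h[t] for t in range(n))
                 for a in range(n_actions)]
            best = min(range(n_actions), key=q.__getitem__)
            # Switch only on strict improvement so ties cannot cause cycling.
            if q[best] < q[policy[s]] - tol:
                policy[s] = best
                changed = True
        if not changed:
            return policy, g, h


# Hypothetical data: states 0 (new), 1 (worn), 2 (badly worn).
# Action 0 = keep, action 1 = replace. Every row puts mass >= 0.2 on state 0.
P = [
    [[0.2, 0.8, 0.0], [0.2, 0.0, 0.8], [0.2, 0.0, 0.8]],  # keep
    [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]],  # replace
]
c = [
    [1.0, 3.0, 8.0],  # operating cost when keeping
    [5.0, 5.0, 5.0],  # replacement cost
]

policy, g, h = policy_iteration(P, c)
print(policy, g, h)
```

Here `g` is the long-run average cost per period of the returned policy and `h` is its relative value (bias) vector normalized at state 0. The sketch says nothing about the iteration bound itself; it only shows the algorithm whose running time the note studies.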