Abstract: In this paper we consider Markov decision processes with average cost criteria, and discuss an approach for characterizing the performance loss associated with using a suboptimal control policy. Because there are often difficulties associated with computing and implementing optimal control policies, heuristic control policies are often used in practice. For such a policy, we would like to be able to compute guaranteed bounds on its performance, specifically its performance relative to an optimal policy. In other words, our goal is to produce a systematic approach for evaluating how far a specific policy is from optimality. This approach is demonstrated on a simple queuing system with a single server and multiple job classes. We use the general methods developed in the first part of the paper to show that for any non-idling policy, suboptimality of the resulting average queue length is bounded by a factor which only involves service rates
Loading