Abstract: We argue that many AI planning problems should be viewed as process-oriented, where the aim is to produce a policy or behavior strategy with no termination condition in mind, rather than goal-oriented. The full power of Markov decision models, recently adopted for AI planning, becomes apparent with process-oriented problems. The question of appropriate optimality criteria becomes more critical in this case; we argue that average-reward optimality is most suitable. While construction of average-optimal policies involves a number of subtleties and computational difficulties, certain aspects of the problem can be solved using compact action representations such as Bayes nets. In particular, we provide an algorithm that identifies the structure of the Markov process underlying a planning problem (a crucial element of constructing average-optimal policies) without explicit enumeration of the problem state space.
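To illustrate what "the structure of the Markov process" can mean here, the following minimal sketch splits the states of a small finite Markov chain into recurrent classes and transient states. It is not the algorithm described in the abstract: it enumerates states explicitly, which is exactly what that algorithm is said to avoid. The matrix `P` and the function names `reachable` and `classify_chain` are hypothetical and chosen only for this example.

```python
from collections import deque


def reachable(adj, start):
    """Return the set of states reachable from `start`, including itself."""
    seen = {start}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        for t in adj[s]:
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return seen


def classify_chain(P):
    """Split a finite Markov chain into recurrent classes and transient states.

    P is an assumed row-stochastic matrix (list of lists) describing the
    chain induced by one fixed policy. States are enumerated explicitly,
    so this only works for small illustrative problems.
    """
    n = len(P)
    # Edge s -> t exists whenever the transition has positive probability.
    adj = [[t for t in range(n) if P[s][t] > 0] for s in range(n)]
    reach = [reachable(adj, s) for s in range(n)]

    recurrent_classes, transient, assigned = [], [], set()
    for s in range(n):
        if s in assigned:
            continue
        # s is recurrent iff every state reachable from s can reach s again.
        if all(s in reach[t] for t in reach[s]):
            cls = sorted(reach[s])
            recurrent_classes.append(cls)
            assigned.update(cls)
        else:
            transient.append(s)
    return recurrent_classes, transient


# Hypothetical 4-state chain: state 0 leaks into the absorbing state 1,
# while states 2 and 3 cycle between each other.
P = [
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
classes, transient = classify_chain(P)
```

A chain with more than one recurrent class, as in this example, is multichain, and its long-run average reward can depend on the starting state. This is one reason the chain structure matters when constructing average-optimal policies.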