Track: Long Paper (9 pages including references)
Previous Publication: Yes, the submission has already been published or accepted at another conference.
Keywords: Landmarks, Probabilistic Planning, Monte Carlo Tree Search, UCT
Abstract: Landmarks—conditions that must be satisfied at some point in every solution plan—have contributed to major advancements in classical planning, but they have seldom been used in stochastic domains. We formalize probabilistic landmarks and adapt the UCT algorithm to leverage them as subgoals to decompose MDPs; core to the adaptation is balancing between greedy landmark achievement and final goal achievement. Our results in benchmark domains show that well-chosen landmarks can significantly improve the performance of UCT in online probabilistic planning, while the best balance of greedy versus long-term goal achievement is problem-dependent. The results suggest that landmarks can provide helpful guidance for anytime algorithms solving MDPs.
Supplementary Material: pdf
Submission Number: 7