Guiding MCTS with Generalized Policies for Probabilistic Planning

Anonymous

18 Mar 2019 (modified: 05 May 2023), ICAPS 2019 Workshop HSDIP, Blind Submission, Readers: Everyone

Keywords: monte-carlo tree search, uct, action-schema networks, probabilistic planning, generalized policies, domain-independent planning

TL;DR: Techniques for combining generalized policies with search algorithms to exploit the strengths and overcome the weaknesses of each when solving probabilistic planning problems

Abstract: We examine techniques for combining generalized policies with search algorithms to exploit the strengths and overcome the weaknesses of each when solving probabilistic planning problems. The Action Schema Network (ASNet) is a recent contribution to planning that uses deep neural networks to learn generalized policies for probabilistic planning problems. ASNets are well suited to problems where local knowledge of the environment can be exploited to improve performance, but may fail to generalize to problems they were not trained on. Monte-Carlo Tree Search (MCTS) is a forward-chaining state-space search algorithm for optimal decision making that performs simulations to incrementally build a search tree and estimate the value of each state. Although MCTS can achieve state-of-the-art results when paired with domain-specific knowledge, without this knowledge it requires a large number of simulations to obtain reliable estimates in the search tree. By combining ASNets with MCTS, we improve the capability of an ASNet to generalize beyond the distribution of problems it was trained on, and enhance the navigation of the search space by MCTS.
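
To make the idea of guiding MCTS with a policy concrete, the following is a minimal sketch of UCT in which a policy biases the simulation (rollout) phase. It is not the authors' implementation: the abstract does not say how ASNets are integrated, so using the policy only in rollouts is an assumption, and the toy chain problem, `guide_policy`, `uniform_policy`, and all constants are hypothetical stand-ins (in particular, `guide_policy` is a fixed distribution, not a trained ASNet).

```python
import math
import random

# Hypothetical toy problem: states 0..4 on a chain, 0 is a dead end, 4 is the goal.
DEAD_END, GOAL, HORIZON = 0, 4, 10
ACTIONS = ("left", "right")

def step(state, action, rng):
    """'right' advances with probability 0.8 (else stays); 'left' always retreats."""
    if action == "right":
        return state + 1 if rng.random() < 0.8 else state
    return state - 1

def guide_policy(state):
    """Stand-in for a learned generalized policy: maps a state to action probabilities."""
    return {"left": 0.1, "right": 0.9}

def uniform_policy(state):
    """Uninformed baseline: every action is equally likely."""
    return {a: 1.0 / len(ACTIONS) for a in ACTIONS}

def rollout(state, depth, policy, rng):
    """Sample actions from `policy` until a terminal state or the horizon; 1.0 iff goal reached."""
    while depth < HORIZON and state not in (GOAL, DEAD_END):
        probs = policy(state)
        action = rng.choices(ACTIONS, weights=[probs[a] for a in ACTIONS])[0]
        state = step(state, action, rng)
        depth += 1
    return 1.0 if state == GOAL else 0.0

def mcts(root, policy, n_sims=500, c=1.4, seed=0):
    """UCT over (state, depth) nodes; returns the most-visited action at the root."""
    rng = random.Random(seed)
    N, Q = {}, {}      # visit counts and mean returns per (node, action)
    expanded = set()   # nodes already added to the tree

    def simulate(state, depth):
        if state == GOAL:
            return 1.0
        if state == DEAD_END or depth >= HORIZON:
            return 0.0
        node = (state, depth)
        if node not in expanded:
            # Expansion: add the node, then estimate its value with a policy-guided rollout.
            expanded.add(node)
            for a in ACTIONS:
                N[node, a], Q[node, a] = 0, 0.0
            return rollout(state, depth, policy, rng)
        total = sum(N[node, a] for a in ACTIONS)

        def ucb(a):
            # UCB1: untried actions first, then mean return plus exploration bonus.
            if N[node, a] == 0:
                return math.inf
            return Q[node, a] + c * math.sqrt(math.log(total) / N[node, a])

        action = max(ACTIONS, key=ucb)
        ret = simulate(step(state, action, rng), depth + 1)
        # Backup: incremental mean update along the sampled path.
        N[node, action] += 1
        Q[node, action] += (ret - Q[node, action]) / N[node, action]
        return ret

    for _ in range(n_sims):
        simulate(root, 0)
    return max(ACTIONS, key=lambda a: N[(root, 0), a])
```

Calling `mcts(1, guide_policy)` runs the search from state 1 with policy-guided rollouts, while `mcts(1, uniform_policy)` gives plain UCT with random rollouts for comparison. A policy could also be used as a prior in the selection step; that variant is omitted here to keep the sketch short.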
