Abstract: Monte-Carlo Tree Search (MCTS) algorithms have been adopted by the artificial intelligence planning community for decades due to their ability to reason over long time horizons while providing guarantees on the convergence of the solution policy. In recent years, such algorithms have been modernized by integrating Convolutional Neural Networks (CNNs) into the state evaluation process. However, both traditional and modern MCTS algorithms suffer from poor performance when the underlying reward signal of the environment is sparse. In this paper, we propose a verification-guided tree search solution that incorporates a reward shaping function within modern MCTS implementations. This function exploits the mathematical representation of the underlying objective, drawing on techniques from formal verification.