Keywords: Computer Go, Monte-Carlo Tree Search, Reinforcement learning, Adaptive, Acceleration
Abstract: One of the most important questions in AI research is how to trade off computation against performance, since "perfect rationality" exists in theory but is impossible to achieve in practice. Recently, Monte-Carlo tree search (MCTS) has attracted considerable attention due to its significant performance gains in a variety of challenging domains. However, the high time cost of search severely restricts its range of applications. This paper proposes Virtual MCTS (V-MCTS), a variant of MCTS that mimics the human behavior of spending different amounts of time thinking about different problems. Inspired by this, we propose a strategy that converges to the ground-truth MCTS search result with much less computation. We give theoretical bounds for V-MCTS and evaluate its performance on $9 \times 9$ Go and Atari games. Experiments show that our method achieves performance similar to that of the original search algorithm while requiring, on average, less than $50\%$ of the number of searches.
We believe that this approach is a viable alternative for tasks with limited time and resources.
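To make the adaptive-budget idea concrete, below is a minimal sketch (not the paper's exact algorithm) of an early-terminating MCTS loop: it periodically compares consecutive root visit distributions and stops once they are close, so easy positions consume fewer simulations. The `root.child_visits()` and `simulate()` interfaces and the L1-distance stopping rule are illustrative assumptions; V-MCTS's actual termination criterion is the one defined in the paper.

```python
import numpy as np

def search_with_early_stop(root, simulate, max_sims=200, check_every=10, eps=0.05):
    # root.child_visits() -> per-action visit counts (np.ndarray);
    # simulate(root) runs one select/expand/backup MCTS pass.
    # Both interfaces are hypothetical, for illustration only.
    prev = None
    for sim in range(1, max_sims + 1):
        simulate(root)                        # one ordinary MCTS simulation
        if sim % check_every == 0:            # periodic convergence check
            visits = root.child_visits().astype(float)
            policy = visits / visits.sum()    # normalized visit distribution
            # Stop once two consecutive snapshots are eps-close in L1 norm.
            if prev is not None and np.abs(policy - prev).sum() < eps:
                return policy, sim
            prev = policy
    visits = root.child_visits().astype(float)
    return visits / visits.sum(), max_sims
```

Under a rule like this, the number of simulations adapts to the position: the loop exits early when the search result has stabilized and only spends the full budget on harder states.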
Community Implementations: 2 code implementations on CatalyzeX (https://www.catalyzex.com/paper/arxiv:2210.12628/code)