
\input{table/recon}

\input{figure/fig_gradcam}

In this paper, we proposed a novel approach that extends MCTS to factored action spaces and addresses the challenges posed by large combinatorial action spaces. The proposed method identifies the compositional structure between state and action variables from high-dimensional observations, without access to the true environment model. Based on this structure, our method constructs a state-conditioned action abstraction for each node on the fly during tree traversal. Experimental results demonstrate that our approach significantly improves the sample efficiency of vanilla MCTS under factored action spaces.

One promising future direction is to combine our approach with state abstraction methods such as bisimulation \citep{larsen1989bisimulation,ferns2014bisimulation,zhang2020learning}. Because our method concerns action abstraction for efficient MCTS, such approaches are orthogonal to our work, and we expect that they can be seamlessly integrated with it. For example, it would be possible to apply MCTS with our proposed state-conditioned action abstraction on the latent \textit{abstract} state, which we defer to future work.