Recurrent Macro Actions Generator for POMDP Planning

Yuanchu Liang, Hanna Kurniawati

Published: 30 Sept 2023, Last Modified: 29 Jan 2026. Venue: 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). License: CC BY 4.0

Abstract: Many planning problems in robotics require long planning horizons and are uncertain in nature. The Partially Observable Markov Decision Process (POMDP) is a mathematically principled framework for planning under uncertainty. To alleviate the difficulty of computing good approximate POMDP solutions for long-horizon problems, one often plans using macro actions, where each macro action is a chain of primitive actions. Such a strategy reduces the effective planning horizon of the problem, and hence the computational complexity of solving it. The difficulty lies in generating a set of suitable macro actions. In this paper, we present a simple recurrent neural network that learns to generate suitable sets of candidate macro actions that exploit environment information. Key to this learning method is representing the raw partial information from the environment as a latent problem instance and sequentially generating macro actions conditioned on past information. We compare our proposed method with the state of the art [1] on four different long-horizon planning tasks of varying difficulty. The results indicate that the quality of the policies computed using macro actions generated by our proposed method consistently exceeds the benchmarks. Our implementation can be accessed at https://github.com/YC-Liang/Recurrent-Macro-Action-Generator.
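To make the horizon-reduction argument concrete, the following minimal Python sketch compares the size of a full belief tree when planning with primitive actions against planning with macro actions. It is an illustration only, not the paper's planner: the function names, the example numbers, and the simplifying assumption that each macro action is followed by a single observation branch are all hypothetical choices made for this sketch.

```python
import math


def effective_depth(horizon: int, macro_length: int) -> int:
    """Number of decision steps needed to cover `horizon` primitive steps
    when every decision commits to `macro_length` primitive actions."""
    return math.ceil(horizon / macro_length)


def belief_tree_leaves(num_actions: int, num_observations: int, depth: int) -> int:
    """Leaves of a full belief tree of the given depth.

    Assumption (for illustration): each decision step branches over every
    action and then over every observation, so the branching factor per
    step is num_actions * num_observations.
    """
    return (num_actions * num_observations) ** depth


if __name__ == "__main__":
    horizon = 12           # primitive steps to plan over (hypothetical)
    num_primitive = 4      # primitive actions (hypothetical)
    num_observations = 3   # observations (hypothetical)
    num_macros = 8         # candidate macro actions (hypothetical)
    macro_length = 4       # primitive actions per macro action (hypothetical)

    primitive_leaves = belief_tree_leaves(num_primitive, num_observations, horizon)
    macro_depth = effective_depth(horizon, macro_length)
    macro_leaves = belief_tree_leaves(num_macros, num_observations, macro_depth)

    print(primitive_leaves)  # 8916100448256
    print(macro_depth)       # 3
    print(macro_leaves)      # 13824
```

Under these assumptions, the macro-action tree is far smaller even though it has more actions per step, because the depth shrinks from 12 to 3. The saving only helps if the candidate macro actions are good ones, which is the generation problem the abstract addresses.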