Integrating Planning and Deep Reinforcement Learning via Automatic Induction of Task Substructures

Published: 16 Jan 2024, Last Modified: 14 Apr 2024
Venue: ICLR 2024 poster
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Deep Reinforcement Learning, Classical Planning, Genetic Programming, Symbolic AI, Learning from Demonstration
Submission Guidelines: I certify that this submission complies with the submission instructions as described on
TL;DR: The work integrates planning and deep reinforcement learning to discover critical action schemata and build hierarchical networks. The framework induces symbolic knowledge of task substructures via genetic programming.
Abstract: Despite recent advances, deep reinforcement learning (DRL) still struggles to learn sparse-reward, goal-directed tasks. Classical planning excels at hierarchical tasks by employing symbolic knowledge, yet most planning methods assume pre-defined subtasks. To combine the strengths of both, we propose a framework that integrates DRL with classical planning by automatically inducing task structures and substructures from a few demonstrations. Specifically, genetic programming is used for substructure induction, where the program model reflects prior domain knowledge of effect rules. We compare the proposed framework to state-of-the-art DRL algorithms, imitation learning methods, and an exploration approach in various domains. Experimental results show that the framework outperforms all of these baselines in sample efficiency and task performance. Moreover, it generalizes well by effectively inducing new rules and composing task structures. Ablation studies justify the design of our induction module and the proposed genetic programming procedure.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Primary Area: reinforcement learning
Submission Number: 2272
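
Illustrative sketch: the abstract describes inducing effect rules from a few demonstrations with genetic programming. The Python below is not the authors' procedure. It is a much simpler stand-in, a mutation-and-selection evolutionary search over additive effect rules, meant only to show the general shape of "search for a symbolic rule that explains observed state transitions". The variable names (`wood`, `stick`), the additive rule form, and every function name here are hypothetical and do not come from the paper.

```python
import random

# Hypothetical symbolic state variables (not taken from the paper).
VARIABLES = ["wood", "stick"]


def apply_rule(rule, state):
    """Apply an additive effect rule: each variable changes by a fixed delta."""
    return {v: state[v] + rule[v] for v in VARIABLES}


def fitness(rule, transitions):
    """Negative total absolute prediction error; 0 means a perfect fit."""
    error = 0
    for before, after in transitions:
        predicted = apply_rule(rule, before)
        error += sum(abs(predicted[v] - after[v]) for v in VARIABLES)
    return -error


def mutate(rule, rng):
    """Copy the rule and nudge one randomly chosen delta by +1 or -1."""
    child = dict(rule)
    var = rng.choice(VARIABLES)
    child[var] += rng.choice([-1, 1])
    return child


def induce_rule(transitions, generations=200, population_size=20, seed=0):
    """Evolve candidate rules toward one that explains all transitions."""
    rng = random.Random(seed)
    population = [{v: 0 for v in VARIABLES} for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=lambda r: fitness(r, transitions), reverse=True)
        if fitness(population[0], transitions) == 0:
            break  # the best candidate already fits every demonstration
        # Keep the better half unchanged and refill with mutated copies.
        survivors = population[: population_size // 2]
        children = [mutate(rng.choice(survivors), rng) for _ in survivors]
        population = survivors + children
    return max(population, key=lambda r: fitness(r, transitions))


# Two demonstrated transitions of a hypothetical "craft stick" action.
demo_transitions = [
    ({"wood": 3, "stick": 0}, {"wood": 2, "stick": 2}),
    ({"wood": 1, "stick": 4}, {"wood": 0, "stick": 6}),
]
induced = induce_rule(demo_transitions)
print(induced)
```

Using a graded error rather than an exact-match count gives the search a signal to climb. A genuine genetic programming setup would instead evolve program structures, typically with crossover as well as mutation.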