Constrained Hierarchical Deep Reinforcement Learning with Differentiable Formal Specifications

Zikang Xiong; Joe Eappen; Ahmed H Qureshi; Suresh Jagannathan

Published: 01 Feb 2023, Last Modified: 13 Feb 2023 | Submitted to ICLR 2023 | Readers: Everyone

Keywords: Deep Reinforcement Learning, Differentiable Formal Specification Language, Robot Navigation, Robot Planning and Control

TL;DR: This paper uses differentiable formal specifications to constrain the policy updates in hierarchical deep reinforcement learning.

Abstract: Formal logic specifications are a useful tool to describe desired agent behavior and have been explored as a means to shape rewards in Deep Reinforcement Learning (DRL) systems over a variety of problems and domains. Prior work, however, has failed to consider the possibility of making these specifications differentiable, which would yield a more informative signal of the objective via the specification gradient. This paper examines precisely such an approach by exploring a Lagrangian method to constrain policy updates using a differentiable style of temporal logic specifications that associates logic formulae with real-valued quantitative semantics. This constrained learning mechanism is then used in a hierarchical setting where a high-level specification-guided neural network path planner works with a low-level control policy to navigate through planned waypoints. The effectiveness of our approach is demonstrated over four robot dynamics with five different types of Linear Temporal Logic (LTL) specifications. Demo videos are available at https://sites.google.com/view/schrl.
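To make the idea of real-valued quantitative semantics concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes a 2D path of waypoints, a reach-avoid specification ("eventually reach the goal and always avoid the obstacle"), and a log-sum-exp smoothing of max and min so that the score varies smoothly with the waypoints. All function names, the temperature `temp`, and the dual step size `lr` are hypothetical choices made for illustration. A positive score means the path satisfies the specification; a Lagrange multiplier grows while the score is negative.

```python
import math

def soft_max(values, temp=10.0):
    """Smooth approximation of max via log-sum-exp (approaches max as temp grows)."""
    m = max(values)  # subtract the max for numerical stability
    return m + math.log(sum(math.exp(temp * (v - m)) for v in values)) / temp

def soft_min(values, temp=10.0):
    """Smooth approximation of min, defined through soft_max."""
    return -soft_max([-v for v in values], temp)

def dist(p, q):
    """Euclidean distance between two 2D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def eventually_reach(path, goal, radius, temp=10.0):
    """Score of 'eventually within radius of goal': smooth max over waypoints."""
    return soft_max([radius - dist(p, goal) for p in path], temp)

def always_avoid(path, obstacle, radius, temp=10.0):
    """Score of 'always outside radius of obstacle': smooth min over waypoints."""
    return soft_min([dist(p, obstacle) - radius for p in path], temp)

def reach_avoid(path, goal, obstacle, r_goal, r_obs, temp=10.0):
    """Conjunction of the two sub-specifications: smooth min of their scores."""
    return soft_min(
        [eventually_reach(path, goal, r_goal, temp),
         always_avoid(path, obstacle, r_obs, temp)],
        temp,
    )

def dual_ascent_step(lmbda, score, lr=0.1):
    """One projected dual update for the constraint score >= 0.

    The multiplier increases when the constraint is violated (score < 0)
    and is clipped at zero otherwise.
    """
    return max(0.0, lmbda - lr * score)

if __name__ == "__main__":
    goal, obstacle = (3.0, 0.0), (1.5, 2.0)
    good_path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
    bad_path = [(0.0, 0.0), (1.5, 2.0), (3.0, 0.0)]  # passes through the obstacle
    for name, path in [("good", good_path), ("bad", bad_path)]:
        score = reach_avoid(path, goal, obstacle, r_goal=0.5, r_obs=0.5)
        print(name, round(score, 3), dual_ascent_step(1.0, score))
```

In a gradient-based setting, the same expressions would be written in an automatic-differentiation framework so that the score's gradient with respect to the waypoints can be used in the constrained update; the pure-Python version here only evaluates the score.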

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: Yes

Please Choose The Closest Area That Your Submission Falls Into: Reinforcement Learning (e.g., decision and control, planning, hierarchical RL, robotics)
