Keywords: Reinforcement Learning, RL, Linear Temporal Logic, LTL, Constrained Policy Optimization, Learning
TL;DR: A method for undiscounted policy optimization under LTL constraints, using a generative model of otherwise unknown dynamics
Abstract: We study the problem of policy optimization (PO) under linear temporal logic (LTL) constraints. The language of LTL allows flexible description of tasks that may be unnatural to encode as a scalar cost function. We consider LTL-constrained PO as a systematic framework that decouples task specification from policy selection, and as an alternative to the standard practice of cost shaping. With access to a generative model, we develop a model-based approach with a sample-complexity guarantee for both task satisfaction and cost optimality, obtained through a reduction to a reachability problem. Empirically, our algorithm achieves strong performance even in low-sample regimes.
Supplementary Material: zip
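The abstract's reduction to a reachability problem can be illustrated in miniature. The sketch below is not the paper's algorithm; it is a hypothetical toy showing only the reachability core: on a small known MDP (standing in for quantities estimated from the generative model), value iteration computes the maximum probability of eventually reaching a target set, the kind of objective an LTL specification induces on a product MDP. All names and the chain-MDP dynamics are illustrative assumptions.

```python
import numpy as np

# Toy 4-state chain MDP (illustrative assumption, not the paper's setting).
# Action 0 moves left, action 1 moves right; moves are deterministic and
# saturate at the chain's endpoints.
n_states, n_actions = 4, 2
P = np.zeros((n_states, n_actions, n_states))  # P[s, a, s'] transition probs
for s in range(n_states):
    P[s, 0, max(s - 1, 0)] = 1.0
    P[s, 1, min(s + 1, n_states - 1)] = 1.0

# Target set playing the role of accepting states in a product construction
# for a spec like "eventually reach G".
target = {3}

# Reachability value iteration: starting from the indicator of the target
# and clamping the target to 1, the iterates increase monotonically to the
# maximum probability of eventually reaching the target from each state.
v = np.zeros(n_states)
for s in target:
    v[s] = 1.0
for _ in range(100):
    q = P @ v              # shape (n_states, n_actions): reach prob per action
    v = q.max(axis=1)
    for s in target:
        v[s] = 1.0

print(v)  # max reachability probability from each state
```

On this chain every state can reach the target by always moving right, so the computed probabilities are all 1; with stochastic transitions the same iteration yields the (generally fractional) optimal satisfaction probabilities that an LTL constraint would bound from below.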