Lagrangian Generative Adversarial Imitation Learning with Safety

29 Sept 2021 (modified: 13 Feb 2023) · ICLR 2022 Conference Withdrawn Submission
Keywords: safe imitation learning, inverse reinforcement learning
Abstract: Imitation Learning (IL) concentrates solely on reproducing expert behaviors and may therefore take dangerous actions, which is unacceptable in safety-critical scenarios. In this work, we first formalize the practical task of safe imitation learning (Safe IL), which has long been neglected. Taking safety into consideration, we augment Generative Adversarial Imitation Learning (GAIL) with safety constraints and then relax the constrained problem into an unconstrained saddle-point problem via a Lagrange multiplier, yielding a method we call LGAIL. We then apply a two-stage optimization framework to solve LGAIL: first, a discriminator is optimized to measure the similarity between agent-generated state-action pairs and expert ones; second, forward reinforcement learning is employed to improve this similarity while addressing safety concerns through the Lagrange multiplier. Moreover, we provide a theoretical interpretation of LGAIL, indicating that it is guaranteed to learn a safe policy even from unsafe expert data. Finally, extensive experiments in OpenAI Safety Gym demonstrate the effectiveness of our approach.
One-sentence Summary: We propose a method to conduct safe imitation learning from expert data that are not guaranteed to be safe.
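To make the two-stage optimization described in the abstract concrete, here is a minimal sketch of one LGAIL-style update in PyTorch. The network sizes, the per-step cost signal, the cost limit `cost_limit`, and the REINFORCE-style policy surrogate are illustrative assumptions, not the paper's implementation; the paper employs forward reinforcement learning on the Lagrangian objective, for which any constrained policy-optimization algorithm could be substituted.

```python
# Minimal sketch of an LGAIL-style two-stage update (assumptions: network sizes,
# per-step cost signal, cost limit, and a simple REINFORCE surrogate in place of
# the forward RL algorithm used in the paper).
import torch
import torch.nn as nn
import torch.nn.functional as F

obs_dim, act_dim = 4, 2

# Discriminator D(s, a): separates expert pairs from agent-generated pairs.
disc = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.Tanh(), nn.Linear(64, 1))
# Simple Gaussian policy over continuous actions (illustrative).
policy = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, act_dim))
# Lagrange multiplier kept nonnegative by parameterizing it in log-space.
log_lambda = torch.zeros(1, requires_grad=True)

d_opt = torch.optim.Adam(disc.parameters(), lr=3e-4)
pi_opt = torch.optim.Adam(policy.parameters(), lr=3e-4)
lam_opt = torch.optim.Adam([log_lambda], lr=1e-2)
cost_limit = 0.05  # assumed per-step safety-cost budget


def lgail_update(expert_s, expert_a, agent_s, agent_a, agent_cost):
    """One two-stage LGAIL-style update on a batch of (state, action) pairs."""
    # --- Stage 1: discriminator measures similarity to expert pairs ---
    expert_logits = disc(torch.cat([expert_s, expert_a], dim=-1))
    agent_logits = disc(torch.cat([agent_s, agent_a], dim=-1))
    d_loss = (F.binary_cross_entropy_with_logits(expert_logits, torch.ones_like(expert_logits))
              + F.binary_cross_entropy_with_logits(agent_logits, torch.zeros_like(agent_logits)))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # --- Stage 2: policy improvement on the Lagrangian objective ---
    lam = log_lambda.exp()
    with torch.no_grad():
        # GAIL-style imitation reward -log(1 - D(s, a)), penalized by the safety cost.
        reward = -F.logsigmoid(-disc(torch.cat([agent_s, agent_a], dim=-1))).squeeze(-1)
        lagrangian_return = reward - lam * agent_cost
    mean = policy(agent_s)
    log_prob = torch.distributions.Normal(mean, 1.0).log_prob(agent_a).sum(-1)
    pi_loss = -(log_prob * lagrangian_return).mean()  # REINFORCE surrogate (stand-in for forward RL)
    pi_opt.zero_grad(); pi_loss.backward(); pi_opt.step()

    # Dual ascent: increase lambda when the average cost exceeds the budget.
    lam_loss = -(log_lambda.exp() * (agent_cost.mean() - cost_limit))
    lam_opt.zero_grad(); lam_loss.backward(); lam_opt.step()


# Example call with random data (batch of 32 transitions):
B = 32
lgail_update(torch.randn(B, obs_dim), torch.randn(B, act_dim),
             torch.randn(B, obs_dim), torch.randn(B, act_dim),
             torch.rand(B))
```

The saddle-point structure shows up directly: the policy maximizes the cost-penalized imitation reward while the dual step raises the multiplier whenever the safety constraint is violated, which is what allows a safe policy to be recovered even when the expert demonstrations themselves are unsafe.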