Keywords: Model-based RL, Safe RL, Safety Filter, Exploration
TL;DR: We combine constrained model-based policy optimization with planning-based safety filters as backup policies to reduce constraint violation rates during exploration.
Abstract: Applying reinforcement learning (RL) to learn effective policies on physical robots without supervision remains challenging for tasks where safe exploration is critical. Constrained model-based RL (CMBRL) is a promising approach to this problem: these methods learn constraint-adhering policies through constrained optimization. Yet such policies often fail to meet stringent safety requirements during learning and exploration. Our method, CASE, aims to reduce the number of constraint violations incurred during the learning phase. Specifically, CASE combines constrained policy optimization with planning-based safety filters that act as backup policies, lowering constraint violation rates during learning and making it a more reliable option than other recent constrained model-based policy optimization methods.
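The core idea in the abstract, falling back from the learned constrained policy to a planning-based backup when a proposed action looks unsafe under a learned model, can be illustrated with a minimal sketch. This is not the paper's CASE algorithm; all names (task_policy, backup_planner, dynamics_model, constraint_cost, horizon, budget) are hypothetical interfaces assumed for illustration.

def is_predicted_safe(state, action, dynamics_model, backup_planner,
                      constraint_cost, horizon=10, budget=0.0):
    # Roll the learned dynamics model forward: first the proposed action,
    # then the backup planner, and accumulate the predicted constraint cost.
    total_cost = 0.0
    s = dynamics_model.step(state, action)      # one step with the proposed action
    total_cost += constraint_cost(s)
    for _ in range(horizon - 1):                # remaining steps under the backup policy
        a = backup_planner.plan(s)
        s = dynamics_model.step(s, a)
        total_cost += constraint_cost(s)
    return total_cost <= budget

def filtered_action(state, task_policy, backup_planner, dynamics_model, constraint_cost):
    # Return the task policy's action when it is predicted to satisfy the
    # constraint budget; otherwise fall back to the planning-based backup.
    proposed = task_policy.act(state)
    if is_predicted_safe(state, proposed, dynamics_model,
                         backup_planner, constraint_cost):
        return proposed
    return backup_planner.plan(state)           # backup keeps exploration within constraints

In this generic pattern, the filter only overrides the task policy when the model-based rollout predicts a constraint breach, so exploration is restricted as little as possible while violations during learning are reduced.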
Supplementary Material: zip
Spotlight Video: mp4
Publication Agreement: pdf
Student Paper: no
Submission Number: 631