Do Androids Dream of Electric Fences? Safety-Aware Reinforcement Learning with Latent Shielding

Chloe He; Borja G. León; Francesco Belardinelli

Published: 28 Jan 2022, Last Modified: 03 Oct 2023

ICLR 2022 Submitted

Readers: Everyone

Keywords: Model-Based Reinforcement Learning, Safety-Aware Reinforcement Learning, Shielding, World Models

Abstract: The growing trend of fledgling reinforcement learning systems making their way into real-world applications has been accompanied by increasing concerns about their safety and robustness. In recent years, a variety of approaches have been put forward to address the challenges of safety-aware reinforcement learning; however, these methods often require either a handcrafted model of the environment to be provided beforehand or an environment that is relatively simple and low-dimensional. We present a novel approach to safety-aware deep reinforcement learning in high-dimensional environments called latent shielding. Latent shielding leverages internal representations of the environment learnt by model-based agents to "imagine" future trajectories and avoid those deemed unsafe. We experimentally demonstrate that this approach leads to improved adherence to formally defined safety specifications.

One-sentence Summary: RL agents can be made safer by getting them to "imagine" the consequences of their actions.
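
The idea of "imagining" future trajectories and rejecting unsafe ones can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the violation threshold, the rollout count, and the fallback rule are all assumptions made for illustration, and a hand-written one-dimensional transition function stands in for the learned latent model that the abstract describes.

```python
import random
from typing import Callable, Sequence

# All names below are hypothetical; a toy transition function stands in
# for a learned latent world model.
Step = Callable[[int, int, random.Random], int]
Policy = Callable[[int, random.Random], int]


def imagined_violation_rate(state: int, first_action: int, step: Step,
                            is_unsafe: Callable[[int], bool], policy: Policy,
                            horizon: int, n_rollouts: int,
                            rng: random.Random) -> float:
    """Fraction of imagined rollouts that reach an unsafe state within `horizon` steps."""
    violations = 0
    for _ in range(n_rollouts):
        s = step(state, first_action, rng)   # imagine taking the candidate action
        unsafe = is_unsafe(s)
        for _ in range(horizon - 1):
            if unsafe:
                break
            s = step(s, policy(s, rng), rng)  # continue the rollout with the policy
            unsafe = is_unsafe(s)
        violations += int(unsafe)
    return violations / n_rollouts


def shielded_action(state: int, proposed: int, actions: Sequence[int],
                    step: Step, is_unsafe: Callable[[int], bool], policy: Policy,
                    horizon: int = 5, n_rollouts: int = 20,
                    threshold: float = 0.1, seed: int = 0) -> int:
    """Keep `proposed` if its imagined violation rate is acceptable, else fall back."""
    rng = random.Random(seed)

    def rate(action: int) -> float:
        return imagined_violation_rate(state, action, step, is_unsafe,
                                       policy, horizon, n_rollouts, rng)

    if rate(proposed) <= threshold:
        return proposed
    # Assumed fallback rule: pick the action with the lowest imagined violation rate.
    return min(actions, key=rate)


# Toy environment: positions on a line, with an "electric fence" at position >= 3.
def toy_step(s: int, a: int, rng: random.Random) -> int:
    return s + a  # deterministic here; a learned model would be stochastic


def toy_unsafe(s: int) -> bool:
    return s >= 3


def cautious_policy(s: int, rng: random.Random) -> int:
    return -1  # rollout policy that always moves away from the fence


if __name__ == "__main__":
    print(shielded_action(2, +1, [-1, +1], toy_step, toy_unsafe, cautious_policy))
    print(shielded_action(0, +1, [-1, +1], toy_step, toy_unsafe, cautious_policy))
```

In this toy setting, the step towards the fence from position 2 is rejected because every imagined rollout ends in an unsafe state, while the same action from position 0 is allowed. With a stochastic model, the threshold trades off caution against how often the agent's proposed action is overridden.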
