Keywords: Safe Reinforcement Learning, Physics Informed Machine Learning, Safety Critical Systems
TL;DR: AutoSafe: A Simple Safe Neural Policy Design
Abstract: Recognizing and avoiding danger is a fundamental capability of biological intelligence, yet this principle remains underexplored in the design of today's neural policies. We present AutoSafe, a novel architecture that embeds safety common sense directly into neural policies for safety-sensitive applications. In particular, AutoSafe integrates a lightweight model-based Safety Evaluation Module that continuously evaluates the risk of safety violations, and leverages a model-based safe policy as a Safety Correction Module to correct potentially unsafe actions at runtime. By incorporating these two components into the policy itself, AutoSafe can be seamlessly integrated into actor-critic reinforcement learning algorithms to maximize performance while maintaining safety. We evaluate AutoSafe on a suite of continuous-control benchmarks, demonstrating that it consistently outperforms other safe reinforcement learning baselines. Finally, we showcase the applicability of the proposed architecture in a real-world continual learning scenario on a cartpole system.
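The abstract describes a two-module design: a Safety Evaluation Module that predicts the risk of a proposed action, and a Safety Correction Module that substitutes a model-based safe action when that risk is too high. A minimal sketch of this evaluate-then-correct loop is below; the paper's actual modules are not specified here, so the policies, the linear dynamics model, the rollout horizon, and the risk threshold (`limit`) are all illustrative assumptions, not the authors' method.

```python
def task_policy(obs):
    # Hypothetical learned task policy (stands in for an actor network).
    return max(-1.0, min(1.0, 0.5 * obs))

def safe_policy(obs):
    # Hypothetical model-based safe fallback (Safety Correction Module):
    # gently push the state back toward zero.
    return -0.1 if obs > 0 else 0.1

def safety_risk(obs, action, horizon=5):
    # Hypothetical Safety Evaluation Module: roll an assumed linear
    # dynamics model forward and report the worst predicted |state|.
    state, worst = obs, 0.0
    for _ in range(horizon):
        state += 0.1 * action  # assumed dynamics: x' = x + 0.1 * a
        worst = max(worst, abs(state))
    return worst

def autosafe_act(obs, limit=1.0):
    # Propose an action, evaluate its predicted risk, and replace it
    # with the safe policy's action if a violation is predicted.
    action = task_policy(obs)
    if safety_risk(obs, action) > limit:
        action = safe_policy(obs)
    return action

print(autosafe_act(0.1))  # low-risk state: the task action passes through
print(autosafe_act(0.9))  # predicted violation: the safe fallback is used
```

Because the check-and-correct step wraps the action computation itself, the combined mapping `autosafe_act` can be treated as the actor inside a standard actor-critic training loop, which matches the abstract's claim that the two designs are part of the policy rather than an external shield.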
Supplementary Material: zip
Primary Area: reinforcement learning
Submission Number: 9347