ROSARL: Reward-Only Safe Reinforcement Learning

Published: 07 Aug 2024, Last Modified: 26 Aug 2024, RLSW 2024 Poster, CC BY 4.0
Confirmation: Yes
Keywords: Reinforcement Learning, Deep Reinforcement Learning, Safety, Safe AI, Safe RL
Abstract: An important problem in reinforcement learning is designing agents that learn to solve tasks safely in an environment. A common solution is to define either a penalty in the reward function or a cost to be minimised when unsafe states are reached. However, designing reward or cost functions is non-trivial, and the difficulty grows with the complexity of the problem. To address this, we investigate the concept of a Minmax penalty: the smallest penalty for unsafe states that leads to safe optimal policies, regardless of task rewards. We derive upper and lower bounds on this penalty by considering both the environment's diameter and its controllability. Additionally, we propose a simple algorithm with which agents can estimate this penalty while learning task policies. Our experiments demonstrate the effectiveness of this approach in enabling agents to learn safe policies in high-dimensional continuous control environments.
Submission Number: 8