Safe Reinforcement Learning with Contrastive Risk Prediction

ICML 2024 Workshop AutoRL, Submission 27

29 May 2024 (modified: 17 Jun 2024) · Submitted to AutoRL@ICML 2024 · CC BY 4.0
Keywords: safe reinforcement learning, contrastive risk prediction
Abstract: As safety violations can lead to severe consequences in real-world applications, the increasing deployment of Reinforcement Learning (RL) in safety-critical domains such as robotics has propelled the study of safe exploration for reinforcement learning (safe RL). In this work, we propose a risk preventive training method for safe RL, which learns a binary classifier based on contrastive sampling to predict the probability of a state-action pair leading to unsafe states. Based on the predicted risk probabilities, risk preventive trajectory exploration and optimality criterion modification can be simultaneously conducted to induce safe RL policies. We conduct experiments in robotic simulation environments. The results show the proposed approach outperforms existing model-free safe RL approaches, and yields comparable performance with the state-of-the-art model-based method.
Submission Number: 27
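The core idea in the abstract can be sketched in a few lines: train a binary risk classifier on contrastively sampled transitions (pairs that led to unsafe states vs. pairs that stayed safe), then use its predicted risk to penalize the reward. This is a minimal illustrative sketch, not the paper's implementation; the synthetic data, the logistic-regression classifier, and all names (`risk`, `shaped_reward`, `lam`) are assumptions for illustration.

```python
# Hedged sketch of contrastive risk prediction for safe RL.
# Synthetic data and a simple logistic-regression classifier stand in
# for the paper's learned binary risk predictor.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Contrastive sample sets: label 1 = state-action pair that led to an
# unsafe state, label 0 = pair that remained safe (both synthetic here).
unsafe = rng.normal(loc=2.0, size=(200, 4))
safe = rng.normal(loc=-2.0, size=(200, 4))
X = np.vstack([unsafe, safe])
y = np.concatenate([np.ones(200), np.zeros(200)])

# Fit the binary risk classifier by gradient descent on the logistic loss.
w, b = np.zeros(4), 0.0
for _ in range(500):
    p = sigmoid(X @ w + b)
    w -= 0.5 * (X.T @ (p - y)) / len(y)
    b -= 0.5 * np.mean(p - y)

def risk(sa):
    """Predicted probability that a state-action pair leads to an unsafe state."""
    return sigmoid(sa @ w + b)

def shaped_reward(r, sa, lam=10.0):
    """Optimality-criterion modification: penalize reward by predicted risk."""
    return r - lam * risk(sa)

# An unsafe-looking pair should score high risk; a safe-looking one, low risk.
print(risk(np.full(4, 2.0)) > 0.9, risk(np.full(4, -0.1)) < 0.5)
```

In a full agent, the same risk estimate would also gate exploration (e.g., rejecting candidate actions whose predicted risk exceeds a threshold), which is the "risk preventive trajectory exploration" component the abstract mentions.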