Safety-Polarized and Prioritized Reinforcement Learning

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · License: CC BY-NC-ND 4.0
Abstract: Motivated by the fact that safety is the first priority in many real-world applications, we propose \textsc{MaxSafe}, a chance-constrained bi-level optimization framework for safe reinforcement learning. \textsc{MaxSafe} first minimizes the probability of unsafe outcomes and then maximizes the return among the safest policies. We provide a tailored Q-learning algorithm for the \textsc{MaxSafe} objective, featuring a novel learning process for \emph{optimal action masks} with theoretical convergence guarantees. To scale our algorithm to large experiments, we introduce two key techniques: \emph{safety polarization} and \emph{safety prioritized experience replay}. Safety polarization generalizes optimal action masking by polarizing the Q-function, assigning low values to unsafe state-action pairs and thereby discouraging their selection. In parallel, safety prioritized experience replay enhances the learning of optimal action masks by prioritizing samples according to temporal-difference (TD) errors derived from our proposed state-action reachability estimation functions, which efficiently addresses the challenge posed by sparse cost signals. Experiments on diverse autonomous driving and safe-control tasks show that our methods achieve near-maximal safety and an optimal reward-safety trade-off.
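For readers who prefer notation, one way to write the abstract's two-stage objective is the following; this is our own shorthand ($\Pi_{\mathrm{safe}}$, $\mathcal{S}_{\mathrm{unsafe}}$, $J$), and the paper's exact chance-constrained formulation may differ:
\[
\Pi_{\mathrm{safe}} \;=\; \operatorname*{arg\,min}_{\pi} \ \Pr\big(\exists\, t:\ s_t \in \mathcal{S}_{\mathrm{unsafe}} \,\big|\, \pi\big),
\qquad
\pi^\star \;\in\; \operatorname*{arg\,max}_{\pi \in \Pi_{\mathrm{safe}}} \ \mathbb{E}_{\pi}\Big[\textstyle\sum_{t \ge 0} \gamma^{t} r_t\Big],
\]
i.e., first shrink the policy class to the maximally safe set, then maximize expected return within it.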
Lay Summary: We present \textsc{MaxSafe}, a framework for training AI agents that prioritize safety before performance. It first avoids risky actions, then selects the best among the safest options. To support effective learning, we introduce two techniques: masking unsafe actions and prioritizing experiences related to safety. Our approach achieves strong safety and performance in tasks like autonomous driving and classic control.
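To make the two techniques concrete, here is a minimal tabular sketch written under our own assumptions. It is not the released implementation: every name, constant, and backup form (e.g. POLARIZED_LOW, the reachability target max(cost, min over next actions)) is a hypothetical illustration of the ideas the abstract describes.

import numpy as np

# Illustrative constants; the paper's actual values and forms may differ.
N_STATES, N_ACTIONS = 20, 4
GAMMA, ALPHA = 0.99, 0.1
POLARIZED_LOW = -1e6                 # "polarized" value for unsafe state-action pairs

Q = np.zeros((N_STATES, N_ACTIONS))  # return estimate
R = np.zeros((N_STATES, N_ACTIONS))  # state-action unsafe-reachability estimate in [0, 1]

def polarized_q(s, threshold=0.5):
    """Safety polarization (sketch): actions whose estimated reachability of the
    unsafe set exceeds a threshold get a very low Q-value, so greedy selection
    avoids them. This generalizes a hard action mask."""
    q = Q[s].copy()
    q[R[s] > threshold] = POLARIZED_LOW
    return q

def reachability_td_error(s, a, cost, s_next):
    """TD error of the reachability estimator; cost is a sparse {0, 1} unsafe signal.
    The backup (max of current cost and safest next-step reachability) is one common
    choice in reachability-style safe RL, assumed here for illustration."""
    target = max(cost, R[s_next].min())
    return target - R[s, a]

def update(s, a, r, cost, s_next):
    """One tabular update (sketch): Q-learning on the greedy-among-safe target,
    plus a reachability update driven by its own TD error."""
    a_next = int(np.argmax(polarized_q(s_next)))
    Q[s, a] += ALPHA * (r + GAMMA * Q[s_next, a_next] - Q[s, a])
    R[s, a] += ALPHA * reachability_td_error(s, a, cost, s_next)

def sample_indices(buffer, batch_size, eps=1e-3):
    """Safety-prioritized replay (sketch): sample transitions (s, a, cost, s_next)
    in proportion to the magnitude of the reachability TD error, so that rare,
    sparse cost signals are replayed often."""
    prios = np.array([abs(reachability_td_error(*tr)) for tr in buffer]) + eps
    return np.random.choice(len(buffer), size=batch_size, p=prios / prios.sum())

In this sketch, greedy selection over polarized_q(s) recovers an action mask whenever the reachability threshold separates safe from unsafe actions; the released code at the link below is the authoritative reference.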
Link To Code: https://github.com/FrankSinatral/Safety-PP.git
Primary Area: Reinforcement Learning
Keywords: Safe Reinforcement Learning
Submission Number: 14741