Sim-to-Lab-to-Real: Safe Reinforcement Learning with Shielding and Generalization Guarantees

Kai-Chieh Hsu; Allen Z. Ren; Duy Phuong Nguyen; Anirudha Majumdar; Jaime Fernández Fisac

Published: 23 Nov 2022, Last Modified: 15 Jun 2025, TEA, Readers: Everyone

Keywords: Reinforcement Learning, Sim-to-Real Transfer, Safety Analysis, Generalization

TL;DR: We propose a two-stage training framework called Sim-to-Lab-to-Real that bridges the reality gap with a probabilistically guaranteed safety-aware policy distribution.

Abstract: Safety is a critical component of autonomous systems and remains a challenge for deploying learning-based policies in the real world. In this paper, we propose Sim-to-Lab-to-Real to bridge the reality gap with a probabilistically guaranteed safety-aware policy distribution. To improve safety, we apply a dual policy setup where a performance policy is trained using the cumulative task reward and a backup (safety) policy is trained by solving the safety Bellman equation based on Hamilton-Jacobi reachability analysis. In Sim-to-Lab transfer, we apply a supervisory control scheme to shield unsafe actions during exploration; in Lab-to-Real transfer, we leverage the Probably Approximately Correct (PAC)-Bayes framework to provide lower bounds on the expected performance and safety of policies in unseen environments. We empirically study the proposed framework for ego-vision navigation in two types of indoor environments, including a photo-realistic one. We also demonstrate strong generalization performance through hardware experiments in real indoor spaces with a quadrupedal robot.
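The shielding idea in the abstract (a supervisory scheme that overrides unsafe actions from the performance policy with the backup policy) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy 1-D policies, the sign convention (a positive critic value means "predicted unsafe"), and the threshold are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

Observation = List[float]
Action = List[float]


def shielded_action(
    obs: Observation,
    performance_policy: Callable[[Observation], Action],
    backup_policy: Callable[[Observation], Action],
    safety_critic: Callable[[Observation, Action], float],
    threshold: float = 0.0,
) -> Tuple[Action, bool]:
    """Return (action, shield_triggered).

    Assumed convention: safety_critic(obs, action) > threshold means the
    proposed action is predicted to lead to a safety violation.
    """
    proposed = performance_policy(obs)
    if safety_critic(obs, proposed) > threshold:
        # Override the task-driven action with the backup (safety) action.
        return backup_policy(obs), True
    return proposed, False


# Hypothetical 1-D toy setup: obs[0] is the distance to an obstacle ahead.
def toy_performance_policy(obs: Observation) -> Action:
    return [1.0]  # always step forward by 1.0


def toy_backup_policy(obs: Observation) -> Action:
    return [-1.0]  # always step backward by 1.0


def toy_safety_critic(obs: Observation, action: Action) -> float:
    # Positive when the step would reach or pass the obstacle.
    return action[0] - obs[0]


far = shielded_action([2.0], toy_performance_policy, toy_backup_policy, toy_safety_critic)
near = shielded_action([0.5], toy_performance_policy, toy_backup_policy, toy_safety_critic)
```

In this toy setup the performance action is kept when the obstacle is far away and replaced by the backup action when the step would reach it. A learned safety critic and a tuned threshold would take the place of the hand-written ones here.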

Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/sim-to-lab-to-real-safe-reinforcement/code)
