Robust Adaptive Multi-Step Predictive Shielding

ICLR 2026 Conference Submission11778 Authors

18 Sept 2025 (modified: 25 Nov 2025)ICLR 2026 Conference SubmissionEveryoneRevisionsBibTeXCC BY 4.0
Keywords: Safe Reinforcement Learning, Control Barrier functions, Model Predictive shielding
TL;DR: A robust multi-step control barrier function for a minimally invasive shielding without sacrificing performance
Abstract: Reinforcement learning for safety-critical tasks requires policies that are both high-performing and safe throughout the learning process. While model-predictive shielding is a promising approach, existing methods are often computationally intractable for the high-dimensional, nonlinear systems where deep RL excels, as they typically rely on a patchwork of local models. We introduce **RAMPS**, a scalable shielding framework that overcomes this limitation by leveraging a learned, linear representation of the environment's dynamics. This model can range from a linear regression in the original state space to a more complex operator learned in a high-dimensional feature space. The key is that this linear structure enables a robust, look-ahead safety technique based on a *multi-step Control Barrier Function (CBF)*. By moving beyond myopic one-step formulations, **RAMPS** accounts for model error and control delays to provide reliable, real-time interventions. The resulting framework is minimally invasive, computationally efficient, and built upon robust control-theoretic foundations. Our experiments demonstrate that **RAMPS** significantly reduces safety violations compared to existing safe RL methods while maintaining high task performance in complex control environments.
Supplementary Material: zip
Primary Area: reinforcement learning
Submission Number: 11778
Loading