Robust Adaptive Multi-Step Predictive Shielding

ICLR 2026 Conference Submission11778 Authors

Published: 26 Jan 2026, Last Modified: 06 Feb 2026, ICLR 2026 Poster, CC BY 4.0

Keywords: Safe Reinforcement Learning, Control Barrier Functions, Model Predictive Shielding

TL;DR: A robust multi-step control barrier function for minimally invasive shielding without sacrificing performance

Abstract: Reinforcement learning for safety-critical tasks requires policies that are both high-performing and safe throughout the learning process. While model-predictive shielding is a promising approach, existing methods are often computationally intractable for the high-dimensional, nonlinear systems where deep RL excels, as they typically rely on a patchwork of local models. We introduce **RAMPS**, a scalable shielding framework that overcomes this limitation by leveraging a learned, linear representation of the environment's dynamics. This model can range from a linear regression in the original state space to a more complex operator learned in a high-dimensional feature space. The key is that this linear structure enables a robust, look-ahead safety technique based on a *multi-step Control Barrier Function (CBF)*. By moving beyond myopic one-step formulations, **RAMPS** accounts for model error and control delays to provide reliable, real-time interventions. The resulting framework is minimally invasive, computationally efficient, and built upon robust control-theoretic foundations. Our experiments demonstrate that **RAMPS** significantly reduces safety violations compared to existing safe RL methods while maintaining high task performance in complex control environments.
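The abstract does not state the exact formulation, so the following is only a minimal sketch of the general idea it describes, not the authors' algorithm. It assumes a linear model x' ≈ A x + B u fitted by least squares in the original state space, a user-supplied barrier function `h` with h(x) ≥ 0 on the safe set, and a standard discrete-time CBF decay condition checked over several steps: h(x_k) ≥ (1 − γ)^k h(x_0) + margin. All function names, the `margin` term standing in for model error, and the grid search over candidate actions (swapped in here for the optimization problem a CBF shield would typically solve) are hypothetical choices made for illustration.

```python
import numpy as np


def fit_linear_model(X, U, X_next):
    """Least-squares fit of x_next ≈ A x + B u from transition data.

    X: (N, n) states, U: (N, m) actions, X_next: (N, n) successor states.
    """
    Z = np.hstack([X, U])                            # regressors, (N, n + m)
    W, *_ = np.linalg.lstsq(Z, X_next, rcond=None)   # (n + m, n)
    n = X.shape[1]
    return W[:n].T, W[n:].T                          # A: (n, n), B: (n, m)


def cbf_slack(x, u, A, B, h, horizon, gamma, margin):
    """Worst-case slack of the multi-step CBF condition when u is held constant.

    The condition at step k is h(x_k) >= (1 - gamma)**k * h(x_0) + margin.
    A non-negative return value means every step up to `horizon` satisfies it.
    """
    h0 = h(x)
    xk = x
    slack = np.inf
    for k in range(1, horizon + 1):
        xk = A @ xk + B @ u                          # roll the linear model forward
        slack = min(slack, h(xk) - (1.0 - gamma) ** k * h0 - margin)
    return slack


def shield(x, u_rl, candidates, A, B, h, horizon=5, gamma=0.2, margin=0.0):
    """Return u_rl if it passes the look-ahead check, else the closest passing candidate."""
    if cbf_slack(x, u_rl, A, B, h, horizon, gamma, margin) >= 0:
        return u_rl                                  # no intervention needed
    slacks = [cbf_slack(x, u, A, B, h, horizon, gamma, margin) for u in candidates]
    safe = [u for u, s in zip(candidates, slacks) if s >= 0]
    if safe:
        # Minimally invasive: stay as close as possible to the policy's action.
        return min(safe, key=lambda u: np.linalg.norm(u - u_rl))
    # Assumed fallback: no candidate passes, so take the least-violating one.
    return candidates[int(np.argmax(slacks))]


if __name__ == "__main__":
    # Toy double integrator (position, velocity) with dt = 0.1; keep position <= 1.
    A = np.array([[1.0, 0.1], [0.0, 1.0]])
    B = np.array([[0.005], [0.1]])
    h = lambda x: 1.0 - x[0]
    candidates = [np.array([v]) for v in np.linspace(-1.0, 1.0, 21)]
    x = np.array([0.5, 0.5])
    u_rl = np.array([1.0])
    print("one-step slack:", cbf_slack(x, u_rl, A, B, h, 1, 0.2, 0.0))
    print("five-step slack:", cbf_slack(x, u_rl, A, B, h, 5, 0.2, 0.0))
    print("shielded action:", shield(x, u_rl, candidates, A, B, h))
```

In this toy setting the full-throttle action satisfies the one-step condition but violates it at the fifth step, which is the kind of case a myopic one-step check would miss. A feature-space model, which the abstract also mentions, would replace `x` with a lifted representation; that variant is not shown here.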

Supplementary Material: zip

Primary Area: reinforcement learning

Submission Number: 11778
