Embedding Learning-based Optimal Controllers with Assured Safety

Published: 2024, Last Modified: 14 May 2025, Venue: CDC 2024, License: CC BY-SA 4.0
Abstract: We consider an off-policy reinforcement learning algorithm that gathers input and state data from a nonlinear system and uses them to approximate the infinite-horizon optimal control for that system. However, because this algorithm relies on neural networks, its convergence depends on restrictive assumptions regarding the underlying neural network structure. Moreover, the approximate optimal controller that it yields is stabilizing only within a compact set $\Omega$ of the state space, which raises significant robustness and safety issues for real-world implementations. Motivated by this, to strengthen the robustness and safety guarantees of controllers obtained by off-policy reinforcement learning procedures, we combine them with a novel safety net. The safety net is minimally interfering: it leaves the approximate optimal controller unaltered within the compact set $\Omega$ in which it is valid and stabilizing. On the other hand, the safety net interferes with the approximate optimal controller whenever the state leaves $\Omega$, so as to guarantee the boundedness and integrity of the closed loop. Since the proposed net is both model-agnostic and learning-free, it provides, for the first time, hard guarantees of safety, established by rigorous theoretical analysis and subsequently verified in simulations.
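The minimally interfering behaviour described in the abstract can be illustrated with a small sketch. This is not the paper's safety net, whose construction the abstract does not give. It is a plain switching rule under assumed ingredients: $\Omega$ is taken to be a closed Euclidean ball, and `nominal_controller`, `fallback_controller`, `in_omega`, the radius, and the gains are all hypothetical stand-ins chosen for illustration.

```python
import math


def in_omega(x, radius):
    """Assumed Omega: closed Euclidean ball of the given radius around the origin."""
    return math.sqrt(sum(xi * xi for xi in x)) <= radius


def nominal_controller(x):
    """Hypothetical stand-in for the learned approximate optimal controller."""
    return [-1.0 * xi for xi in x]


def fallback_controller(x, gain=5.0):
    """Hypothetical override applied outside Omega (not the paper's design)."""
    return [-gain * xi for xi in x]


def safe_controller(x, radius=1.0):
    """Return the nominal input unchanged inside Omega, the override outside it."""
    if in_omega(x, radius):
        return nominal_controller(x)
    return fallback_controller(x)
```

Inside the assumed ball the returned input is exactly the nominal one, which mirrors the "unaltered within $\Omega$" property; outside it, only the override acts. A simple switch like this carries none of the boundedness guarantees claimed for the proposed net.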