Abstract: Safety-critical applications demand control methods that are not only scalable but also verifiable. Traditional control strategies, which rely on certification processes, often struggle to scale to the complexity inherent in these systems. Conversely, while reinforcement learning (RL) techniques scale effectively, their verifiability remains a significant challenge. Our research bridges this gap by offering strong guarantees on constraint satisfaction for general dynamical systems, diverging from previous works that focus primarily on certification. We study the prerequisites for verifying learned value functions (VFs) through the lens of control barrier function (CBF) properties. Building on the foundational principles of safe value functions (SVFs), we design a reward mechanism that guides the optimal VF to inherently satisfy the CBF conditions. The resulting VF can then restrict subsequent policy actions to safe trajectories in complex control problems. Furthermore, we investigate the feasibility of formally verifying VFs by exploiting their CBF properties. This work marks a step towards control methods that are both scalable to complex systems and amenable to rigorous verification. By integrating learning-based control with traditional safety guarantees, we pave the way for more reliable and efficient solutions in safety-critical applications. The code and a supplementary video are available on our webpage.