Optimal Runtime Assurance via Reinforcement Learning

Published: 01 Jan 2024, Last Modified: 11 Mar 2025 · ICCPS 2024 · CC BY-SA 4.0
Abstract: AI and machine learning could enhance autonomous systems, provided the risk of safety violations can be mitigated. Specific instances of runtime assurance (RTA) have been successful in safely testing untrusted, learning-enabled controllers, but a general design methodology for RTA remains a challenge. The problem is to create a logic that assures safety by switching to a safety (or backup) controller as needed, while maximizing a performance criterion, such as the utilization of the untrusted controller. Existing RTA design strategies are well known to be overly conservative and can, in principle, lead to safety violations. In this paper, we formulate the optimal RTA design problem and present an approach for solving it. Our approach relies on reward shaping and reinforcement learning. It can guarantee that safety and other hard constraints are met, and it leverages machine learning technologies for scalability. We have implemented this algorithm and present extensive experimental results on challenging scenarios involving aircraft models, multi-agent systems, realistic simulators, and complex safety requirements. Our experimental results suggest that this RTA design approach can be effective in guaranteeing hard safety constraints while increasing utilization over existing approaches.
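To make the RTA switching problem concrete, the following is a minimal, self-contained sketch on a toy 1-D system. All names and numbers here (`UNSAFE_POS`, `SWITCH_MARGIN`, the controllers, the shaped reward) are illustrative assumptions, not the paper's actual implementation; in particular, the hand-coded switching condition stands in for the learned switching logic, and the shaped reward shows the general idea of a utilization bonus combined with a large safety-violation penalty.

```python
# Toy RTA wrapper: an untrusted controller is overridden by a backup
# controller whenever a switching condition predicts risk. Names and
# constants are illustrative assumptions, not the paper's design.

UNSAFE_POS = 10.0    # hard safety constraint: position must stay below this
SWITCH_MARGIN = 2.0  # stand-in for a learned switching boundary

def untrusted_controller(pos, vel):
    return 1.0   # aggressive: always accelerates toward the unsafe region

def backup_controller(pos, vel):
    return -1.0  # conservative: always brakes

def rta_step(pos, vel, dt=0.1):
    """One simulation step under the RTA switching logic."""
    # Predicted stopping distance under full braking (continuous-time estimate);
    # a learned policy would replace this hand-coded condition.
    stopping_dist = vel * vel / 2.0 if vel > 0 else 0.0
    use_backup = pos + stopping_dist > UNSAFE_POS - SWITCH_MARGIN
    accel = backup_controller(pos, vel) if use_backup else untrusted_controller(pos, vel)
    pos += vel * dt
    vel += accel * dt
    return pos, vel, use_backup

def shaped_reward(pos, use_backup):
    """Reward shaping: bonus for using the untrusted controller
    (utilization), large penalty as a surrogate for the hard
    safety constraint."""
    r = 0.0 if use_backup else 1.0
    if pos >= UNSAFE_POS:
        r -= 1000.0
    return r

pos, vel = 0.0, 0.0
total_return, backup_steps = 0.0, 0
for _ in range(500):
    pos, vel, ub = rta_step(pos, vel)
    total_return += shaped_reward(pos, ub)
    backup_steps += ub

print(f"final pos={pos:.2f}, backup used {backup_steps}/500 steps, "
      f"return={total_return:.1f}")
```

Running the loop, the system accelerates until the switching condition fires, then chatters near the switching boundary: safety is never violated, but utilization is capped by how conservative `SWITCH_MARGIN` is. The optimal-RTA framing in the abstract is precisely about tuning this trade-off, replacing the fixed margin with a policy learned to maximize the shaped return subject to the hard constraint.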