Runtime Learning Machine

Yihao Cai, Yanbing Mao, Lui Sha, Hongpeng Cao, Marco Caccamo

Published: 10 Jun 2025, Last Modified: 17 Oct 2025, Venue: ACM Transactions on Cyber-Physical Systems, License: CC BY-SA 4.0

Abstract: This paper proposes the runtime learning machine for safety-critical, learning-enabled cyber-physical systems (CPS). The learning machine has three interacting components: a high-performance (HP)-Student, a high-assurance (HA)-Teacher, and a Coordinator. The HP-Student is a high-performance but not fully verified Phy-DRL (physics-regulated deep reinforcement learning) agent that performs runtime learning in real CPS, using real-time sensor data from real-time physical environments. The HA-Teacher, in contrast, is a verified but simplified design that focuses only on safety-critical functions. As a complement to the HP-Student, the HA-Teacher's novelty lies in real-time patching for two missions: i) correcting unsafe learning of the HP-Student, and ii) backing up safety. The Coordinator manages the interaction between the HP-Student and the HA-Teacher. Powered by these three interacting components, the runtime learning machine notably features i) assured lifetime safety (i.e., a safety guarantee at every stage of runtime learning), ii) tolerance of unknown unknowns, iii) mitigation of the Sim2Real gap, and iv) automatic hierarchy learning (i.e., safety-first learning, followed by high-performance learning). Experiments on a cart-pole system and two quadruped robots, together with comparisons against state-of-the-art safe DRL, fault-tolerant DRL, and approaches for addressing the Sim2Real gap, demonstrate the learning machine's effectiveness and unique features.
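
The division of labor among the three components can be illustrated with a minimal Python sketch. The abstract does not specify the switching rule, so everything concrete here is an assumption made for illustration: the safety envelope is taken to be an ellipsoid x^T P x < 1, the Coordinator hands control to the HA-Teacher for a fixed number of steps (the hypothetical `dwell_steps`) once the state leaves that envelope, and `student` and `teacher` are arbitrary callables standing in for the Phy-DRL agent and the verified controller. It is not the authors' implementation.

```python
import numpy as np


class Coordinator:
    """Illustrative switch between an HP-Student and an HA-Teacher.

    Assumptions (not taken from the paper): the safety envelope is the
    ellipsoid {x : x^T P x < 1}, and the teacher keeps control for a
    fixed dwell time after a violation is detected.
    """

    def __init__(self, student, teacher, P, dwell_steps=10):
        self.student = student          # callable: state -> action (unverified learner)
        self.teacher = teacher          # callable: state -> action (verified backup)
        self.P = np.asarray(P, dtype=float)
        self.dwell_steps = dwell_steps  # hypothetical teacher hold time, in steps
        self.teacher_steps_left = 0

    def is_safe(self, x):
        # Membership test for the assumed ellipsoidal safety envelope.
        return float(x @ self.P @ x) < 1.0

    def act(self, x):
        x = np.asarray(x, dtype=float)
        # Hand control to the teacher when the student is in charge
        # and the state has left the envelope.
        if self.teacher_steps_left == 0 and not self.is_safe(x):
            self.teacher_steps_left = self.dwell_steps
        if self.teacher_steps_left > 0:
            self.teacher_steps_left -= 1
            return self.teacher(x), "teacher"
        return self.student(x), "student"


# Toy stand-ins: a zero-action student and a linear-feedback teacher.
K = np.array([[1.0, 0.5]])
coordinator = Coordinator(
    student=lambda x: np.zeros(1),
    teacher=lambda x: -K @ x,
    P=np.eye(2),
    dwell_steps=2,
)

sources = [
    coordinator.act(x)[1]
    for x in ([0.1, 0.1], [2.0, 0.0], [0.1, 0.1], [0.1, 0.1])
]
```

In this toy run the second state lies outside the assumed envelope, so the teacher takes over for two steps before control returns to the student. In the paper's setting the teacher's actions would additionally serve to correct the student's learning, which this sketch does not model.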

External IDs: doi:10.1145/3744351