Deep Reinforcement Learning-Based Multi-Layer Cascaded Resilient Recovery for Cyber-Physical Systems

Published: 2024 · Last Modified: 18 Jan 2026 · IEEE Trans. Serv. Comput. 2024 · CC BY-SA 4.0
Abstract: Cyber-physical systems (CPSs) are intricate systems integrating physical and computational components. Failures of these components, whether caused by malfunction or cyber-attack, can lead to significant damage or even collapse of the network topology. Cyber resilience, defined as the capability of a network to restore its function and structure after component failures, is crucial for ensuring that CPSs sustain their operational capabilities under complex disturbances. CPS resilience has recently garnered increasing attention, leading to the development of various resilience recovery methods. However, most existing studies address network-layer and physical-layer resilience in isolation, which hampers adaptive resilience recovery decisions across different systems. To overcome these limitations, we propose a multi-layer cascaded resilient recovery framework grounded in deep reinforcement learning. First, we synthesize the complex interactions between the information and physical layers in CPS resilience recovery from a global perspective, modeling the interrelations within CPSs. Next, we introduce a hybrid resilient recovery strategy encompassing both horizontal and vertical resilient recovery: a correlation matrix partitions the system into horizontal and vertical resilience slices, and the recovery strategy is then modeled as an optimization problem over these slices. We then apply the Deep Recurrent Q-learning (DRQL) algorithm to implement the resilient recovery strategy in CPSs. Although DRQL exhibits strong adaptability, it may sample critical experiences too sparsely, hindering the learning process and convergence on essential experiences. To address this issue, we further develop the RR-DRQL algorithm, designed to identify the optimal CPS resilient recovery strategy.
The RR-DRQL algorithm is rigorously proven to converge to the optimal solution through extensive theoretical analysis. Comprehensive experiments demonstrate that RR-DRQL surpasses existing resilience recovery methods by 3.8%–25% in resilient-policy recovery performance across realistic scenarios and various simulation platforms.
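The abstract describes partitioning a CPS into horizontal (intra-layer) and vertical (cross-layer) resilience slices via a correlation matrix. The paper's exact construction is not given here, so the following is only a minimal sketch under illustrative assumptions: nodes carry a layer label, and any node pair whose absolute correlation exceeds a threshold is assigned to the horizontal or vertical slice depending on whether the pair lies within one layer or spans layers. The function name, threshold, and slicing rule are assumptions, not the paper's method.

```python
# Hypothetical sketch of correlation-matrix slicing for a CPS.
# The threshold and intra- vs cross-layer rule are illustrative
# assumptions, not the paper's actual construction.

def partition_slices(corr, layer_of, threshold=0.5):
    """Return (horizontal, vertical) edge lists from a correlation matrix.

    corr      : square list-of-lists of pairwise correlations
    layer_of  : list mapping node index -> layer id (e.g. 0=cyber, 1=physical)
    threshold : minimum |correlation| for a pair to enter a slice
    """
    horizontal, vertical = [], []
    n = len(corr)
    for i in range(n):
        for j in range(i + 1, n):
            if abs(corr[i][j]) < threshold:
                continue  # weakly coupled pair: ignored by both slices
            edge = (i, j, corr[i][j])
            if layer_of[i] == layer_of[j]:
                horizontal.append(edge)  # intra-layer coupling
            else:
                vertical.append(edge)    # cross-layer (cyber-physical) coupling
    return horizontal, vertical


# Toy example: nodes 0 and 1 in the cyber layer, node 2 in the physical layer.
corr = [[1.0, 0.9, 0.2],
        [0.9, 1.0, 0.7],
        [0.2, 0.7, 1.0]]
h, v = partition_slices(corr, layer_of=[0, 0, 1])
# h -> [(0, 1, 0.9)]  (horizontal slice: strong cyber-cyber coupling)
# v -> [(1, 2, 0.7)]  (vertical slice: strong cyber-physical coupling)
```

The recovery strategy can then be posed as an optimization problem over these two edge sets, restoring each slice's components in an order chosen by the learning agent.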
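The stated weakness of plain DRQL is sparse selection of critical samples. One standard way such an issue is mitigated, and the kind of mechanism RR-DRQL plausibly builds on, is priority-weighted experience sampling, where transitions with large TD error are replayed more often than under uniform sampling. The class below is only a hedged sketch of that general idea; the buffer design, the `alpha` exponent, and all names are illustrative assumptions rather than the paper's algorithm.

```python
import random

# Hypothetical sketch: priority-weighted replay sampling, so that
# critical (high-TD-error) transitions are not sampled too sparsely.
# Parameters and eviction policy are illustrative assumptions.

class PrioritizedReplay:
    def __init__(self, capacity, alpha=0.6, seed=None):
        self.capacity = capacity
        self.alpha = alpha           # how strongly priority skews sampling
        self.buffer = []             # stored transitions
        self.priorities = []         # one sampling priority per transition
        self.rng = random.Random(seed)

    def add(self, transition, td_error):
        # Larger TD error -> larger priority; small constant avoids zero weight.
        priority = (abs(td_error) + 1e-6) ** self.alpha
        if len(self.buffer) >= self.capacity:
            self.buffer.pop(0)       # evict the oldest transition when full
            self.priorities.pop(0)
        self.buffer.append(transition)
        self.priorities.append(priority)

    def sample(self, batch_size):
        # Weighted sampling with replacement: critical transitions are
        # drawn far more often than under uniform replay.
        return self.rng.choices(self.buffer, weights=self.priorities,
                                k=batch_size)


# Usage: a high-TD-error transition dominates the sampled batches.
buf = PrioritizedReplay(capacity=10, seed=0)
buf.add("critical", td_error=5.0)
buf.add("routine", td_error=0.1)
draws = buf.sample(1000)
# draws.count("critical") >> draws.count("routine")
```

Under uniform replay both transitions would be drawn equally often; the priority weighting concentrates updates on the experiences that matter most for convergence, which is the behavior the abstract attributes to RR-DRQL.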