Reinforcement Learning for Adaptive Cyber Defense Against Zero-Day Attacks

Zhisheng Hu, Ping Chen, Minghui Zhu, Peng Liu

Published: 01 Jan 2019, Last Modified: 11 Apr 2025. Venue: Adversarial and Uncertain Reasoning for Adaptive Cyber Defense 2019. License: CC BY-SA 4.0

Abstract: In this chapter, we leverage reinforcement learning as a unified framework to design effective adaptive cyber defenses against zero-day attacks. Reinforcement learning is an integration of control theory and machine learning. A salient feature of reinforcement learning is that it does not require the defender to know critical information about zero-day attacks (e.g., their targets and the locations of the vulnerabilities), information that is difficult, if not impossible, for the defender to gather in advance. The reinforcement-learning-based schemes are applied to defeat three classes of attacks: strategic attacks, where the interactions between an attacker and a defender are modeled as a non-cooperative game; non-strategic random attacks, where the attacker chooses its actions according to a predetermined probability distribution; and attacks depicted by Bayesian attack graphs, where the attacker exploits combinations of multiple known or zero-day vulnerabilities to compromise machines in a network.
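
To make the second attack class concrete, the following is a minimal sketch and not the chapter's actual scheme. It assumes a toy setting with three machines, an attacker that picks its target from a fixed probability distribution, and a defender that can protect one machine per step. The function name `train_defender`, the reward values, and the probabilities are all hypothetical. The defender never reads the attacker's distribution; it only observes rewards, which is the property the abstract highlights. For brevity the sketch uses a single-state, bandit-style simplification with sample-average value estimates and epsilon-greedy exploration rather than a full multi-state Q-learning formulation.

```python
import random


def train_defender(attack_probs, episodes=5000, epsilon=0.1, seed=0):
    """Learn which machine to protect from observed rewards only.

    attack_probs is consumed solely by the simulated attacker; the
    defender's update rule never reads it. All values are hypothetical.
    """
    rng = random.Random(seed)
    n = len(attack_probs)
    machines = list(range(n))
    q = [0.0] * n      # estimated value of protecting each machine
    counts = [0] * n   # how often each machine has been protected

    for _ in range(episodes):
        # Epsilon-greedy: explore occasionally, otherwise exploit estimates.
        if rng.random() < epsilon:
            action = rng.randrange(n)
        else:
            action = max(machines, key=lambda m: q[m])

        # Non-strategic attacker: target drawn from a fixed distribution.
        target = rng.choices(machines, weights=attack_probs)[0]

        # Assumed reward: +1 if the attack was blocked, -1 otherwise.
        reward = 1.0 if action == target else -1.0

        # Sample-average update of the value estimate for the chosen action.
        counts[action] += 1
        q[action] += (reward - q[action]) / counts[action]

    return q


q_values = train_defender([0.1, 0.7, 0.2])
best_machine = max(range(len(q_values)), key=lambda m: q_values[m])
print("Estimated values:", [round(v, 2) for v in q_values])
print("Machine the defender learns to protect:", best_machine)
```

With these assumed probabilities, the defender's estimates should come to favor the most frequently attacked machine (index 1), even though the distribution itself is never revealed to it. Strategic attackers and Bayesian attack graphs would need a richer state and opponent model than this sketch provides.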