APRIL: Towards Scalable and Transferable Autonomous Penetration Testing in Large Action Space via Action Embedding
Abstract: Penetration testing (pentesting) assesses cybersecurity through simulated attacks, but conventional manual approaches are costly, time-consuming, and constrained by the availability of skilled personnel. Reinforcement learning (RL) offers an agent-environment interaction learning paradigm, making it a promising approach to autonomous pentesting. However, poor agent scalability in large action spaces and limited policy transferability across scenarios restrict the applicability of RL-based autonomous pentesting. To address these challenges, we present APRIL, a novel RL-based autonomous pentesting framework that trains agents to be scalable and transferable in large action spaces. In APRIL, we construct a realistic, bounded, host-level state space via embedding techniques, avoiding the complexity of handling unbounded network-level information. We exploit semantic correlations between pentesting actions as prior knowledge to map the discrete action space into a continuous, semantically meaningful embedding space. Agents are then trained to reason over actions within this embedding space, where two key methods are applied: an upper confidence bound (UCB)-based action refinement method that encourages efficient exploration, and a distance-aware loss that improves learning efficiency and generalization. We conduct experiments in simulated scenarios built on virtualized vulnerable environments. The results demonstrate APRIL's scalability in large action spaces and its ability to facilitate policy transfer across diverse scenarios.
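To make the abstract's action-embedding pipeline concrete, the sketch below illustrates one plausible reading of it: a policy emits a continuous proto-action, the k nearest discrete actions in the embedding space are retrieved as candidates, and a UCB-style score refines the final choice; a distance-aware term then penalizes the gap between the proto-action and the executed action's embedding. All names, hyperparameters (k, the UCB constant c, the weight beta), and the exact loss form are illustrative assumptions, not APRIL's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_ACTIONS, EMB_DIM = 10_000, 64                      # large discrete action space
action_emb = rng.normal(size=(NUM_ACTIONS, EMB_DIM))   # pretrained action embeddings (assumed given)
visit_counts = np.ones(NUM_ACTIONS)                    # per-action visitation counts
total_steps = 1                                        # global step counter for the UCB bonus


def select_action(proto_action, q_values, k=50, c=1.0):
    """Map a continuous proto-action to a discrete action (hypothetical sketch).

    proto_action: (EMB_DIM,) vector produced by the policy network.
    q_values:     (NUM_ACTIONS,) critic estimates for the current state.
    """
    # 1) k-nearest-neighbor lookup in the action embedding space.
    dists = np.linalg.norm(action_emb - proto_action, axis=1)
    candidates = np.argpartition(dists, k)[:k]

    # 2) UCB-style refinement over the candidate set: trade off the
    #    critic's value estimate against an exploration bonus that
    #    favors rarely tried actions.
    bonus = c * np.sqrt(np.log(total_steps) / visit_counts[candidates])
    return candidates[np.argmax(q_values[candidates] + bonus)]


def distance_aware_loss(q_pred, q_target, chosen_emb, proto_action, beta=0.1):
    """One assumed form of a distance-aware objective: the usual squared
    TD error plus a penalty on the embedding-space distance between the
    proto-action and the action actually executed. The paper's exact
    formulation may differ."""
    td_error = (q_pred - q_target) ** 2
    emb_gap = np.sum((chosen_emb - proto_action) ** 2)
    return td_error + beta * emb_gap
```

Under this reading, the nearest-neighbor lookup keeps action selection tractable as the discrete action set grows, while the UCB bonus and the distance penalty respectively target the exploration-efficiency and generalization claims made in the abstract.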