PentraFormer: Learning Agents for Automated Penetration Testing via Sequence Modeling

Yunfei Wang, Shixuan Liu, Wenhao Wang, Cheng Zhu, Changjun Fan, Kuihua Huang, Chao Chen

Published: 01 Jan 2024, Last Modified: 15 May 2025, Venue: iThings/GreenCom/CPSCom/SmartData/Cybermatics 2024, License: CC BY-SA 4.0

Abstract: The rapid growth of computer networks has intensified the need for robust security measures, making penetration testing, an essential practice that assesses vulnerabilities by simulating attacks, critically important. Automated penetration testing (APT) has evolved from rule-based approaches to intelligent decision-making methods such as reinforcement learning (RL), which better emulate the adaptability and decision-making process of human penetration testers. However, applying RL to APT is hindered by the high cost of real-network interactions, the scarcity of security datasets, and the long decision sequences and delayed rewards characteristic of APT scenarios, all of which impair RL efficacy and convergence. Inspired by the Decision Transformer's ability to predict action sequences in offline RL, together with its data efficiency and adaptability, we introduce a sequence-modeling approach to designing APT agents. Our proposed model, PentraFormer, is tailored to data-scarce and complex APT scenarios. Extensive empirical evaluations on APT-dedicated tasks demonstrate the robustness of the model.
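To make the sequence-modeling framing concrete, the sketch below shows the return-to-go conditioning used by the Decision Transformer: each step's delayed reward is folded into a suffix sum, and the trajectory is flattened into (return-to-go, state, action) tokens that a sequence model can be trained on offline. This is a minimal illustration, not PentraFormer's actual input pipeline, which the abstract does not specify. The function names, the reward values, and the state and action labels are all hypothetical.

```python
from typing import Any, List, Sequence, Tuple


def returns_to_go(rewards: Sequence[float]) -> List[float]:
    """Suffix sums of rewards: rtg[t] = r[t] + r[t+1] + ... + r[T-1]."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step accumulates all future rewards.
    for t in range(len(rewards) - 1, -1, -1):
        running += rewards[t]
        rtg[t] = running
    return rtg


def to_token_sequence(
    rewards: Sequence[float],
    states: Sequence[Any],
    actions: Sequence[Any],
) -> List[Tuple[str, Any]]:
    """Interleave (return-to-go, state, action) per step, Decision Transformer style."""
    tokens: List[Tuple[str, Any]] = []
    for g, s, a in zip(returns_to_go(rewards), states, actions):
        tokens.extend([("rtg", g), ("state", s), ("action", a)])
    return tokens


# Hypothetical episode: small step costs, one large reward at the end
# (an assumed stand-in for a delayed reward on compromising a target host).
rewards = [-1.0, -1.0, -1.0, 100.0]
states = ["s0", "s1", "s2", "s3"]
actions = ["scan_subnet", "scan_host", "exploit_service", "escalate_privilege"]

print(returns_to_go(rewards))  # [97.0, 98.0, 99.0, 100.0]
for token in to_token_sequence(rewards, states, actions):
    print(token)
```

Because the return-to-go at the first step already reflects the final reward, the model sees the delayed payoff from the start of the sequence, which is one reason this formulation is attractive for long-horizon tasks with sparse rewards.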