Graph Pre-training for Reconnaissance Perception in Automated Penetration Testing

Yunfei Wang, Shixuan Liu, Chao Zhang, Wenhao Wang, Jiandong Jin, Cheng Zhu, Changling Zhou

Published: 2024, Last Modified: 15 May 2025. Venue: ICIC (3) 2024. License: CC BY-SA 4.0

Abstract: In automated penetration testing (APT), agents must identify attack targets and formulate appropriate action plans within partially observed network environments. Reasoning over the network based on information gathered during reconnaissance is therefore essential. However, existing reasoning methods largely neglect computer networks and their unique characteristics. Additionally, although Graph Neural Networks (GNNs) have demonstrated efficacy in modeling graph structures, the scarcity of adequately labeled network data complicates their training. We present a novel method, termed Graph Pre-training for Reconnaissance Perception in Automated Penetration Testing (GPRP). During pre-training, the approach learns the invariant properties of the structures and semantics of computer networks from an extensive set of unlabeled and synthetic data. The resulting pre-trained model can then adapt swiftly to target networks after fine-tuning on very few network observations, and it exhibits enhanced capability in reasoning about network properties. Extensive experiments on both customized and FatTree networks demonstrate the efficacy of our model on network reasoning tasks such as node classification and link prediction. Further verification of GPRP in a real-world local area network underscores the practical utility of our method.