CodeGuard: Structural Code Analysis with Graph Neural Networks for Memory Safety Vulnerability Detection in C/C++

ACL ARR 2026 January Submission3063 Authors

04 Jan 2026 (modified: 20 Mar 2026) | ACL ARR 2026 January Submission | Readers: Everyone | License: CC BY 4.0

Keywords: Vulnerability Detection, Graph Neural Networks, Static Analysis, Source Code Representation, Software Security, Curriculum Learning

Abstract: Memory safety vulnerabilities in C and C++ remain a critical systemic risk. Traditional static analysis often suffers from high false positive rates, while state-of-the-art machine learning models typically rely on compiler-generated Intermediate Representations (IR) and therefore fail on non-compilable code fragments. We present CodeGuard, a vulnerability detection framework that applies heterogeneous Message Passing Neural Networks (MPNNs) directly to source code. By constructing structural graphs that integrate Abstract Syntax Trees (ASTs), Control Flow Graphs (CFGs), and data flow, CodeGuard captures complex syntactic dependencies without requiring a build environment. Extensive evaluation on three real-world benchmarks (Big-Vul, Devign, and MegaVul) demonstrates that CodeGuard achieves state-of-the-art performance, yielding an F1 score of 95.2% on Big-Vul and 91.8% on the large-scale MegaVul dataset. This approach eliminates the build-chain requirement while outperforming compilation-dependent baselines in both precision and recall.
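
To make the idea of heterogeneous message passing over a combined AST/CFG/data-flow graph concrete, the following is a minimal sketch and not CodeGuard's implementation. The node names, the toy edges, the scalar per-relation weights (`REL_WEIGHT`), and the function `message_passing_step` are all hypothetical and chosen only for illustration; a real MPNN would use learned weight matrices and nonlinearities.

```python
# Toy heterogeneous graph for a hypothetical C fragment:
#   char buf[8];  strcpy(buf, src);  return buf[0];
# All names and values below are illustrative assumptions.

FEATURES = {
    "func": [1.0],
    "decl_buf": [2.0],
    "call_strcpy": [3.0],
    "return_buf": [4.0],
}

# Directed edges (source, target), grouped by relation type.
EDGES = {
    "ast": [("func", "decl_buf"), ("func", "call_strcpy"), ("func", "return_buf")],
    "cfg": [("decl_buf", "call_strcpy"), ("call_strcpy", "return_buf")],
    "dfg": [("decl_buf", "call_strcpy"), ("call_strcpy", "return_buf")],
}

# One scalar weight per relation (assumption: stands in for a learned matrix).
REL_WEIGHT = {"ast": 0.5, "cfg": 1.0, "dfg": 2.0}


def message_passing_step(features, edges, rel_weight):
    """Run one round: each node adds the relation-weighted sum of its in-neighbors."""
    dim = len(next(iter(features.values())))
    incoming = {node: [0.0] * dim for node in features}
    for relation, pairs in edges.items():
        weight = rel_weight[relation]
        for src, dst in pairs:
            for i, value in enumerate(features[src]):
                incoming[dst][i] += weight * value
    # Residual update: keep the node's own state and add aggregated messages.
    return {
        node: [h + m for h, m in zip(features[node], incoming[node])]
        for node in features
    }


if __name__ == "__main__":
    for node, state in message_passing_step(FEATURES, EDGES, REL_WEIGHT).items():
        print(node, state)
```

Keeping edges grouped by relation lets each view of the program (syntax, control flow, data flow) contribute with its own weight, which is the property that distinguishes a heterogeneous graph from a single untyped one.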

Paper Type: Long

Research Area: NLP Applications

Research Area Keywords: Code generation and understanding, Security/Privacy

Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models

Languages Studied: C, C++

Submission Number: 3063
