Keywords: code security, transfer learning, reinforcement learning
Abstract: While large language models (LLMs) show promise in software security, they struggle to comprehend the underlying logic of vulnerabilities and are often confounded by the high semantic similarity between flawed and patched code. To overcome these limitations, we introduce HeroCode, a novel model designed to transform general-purpose LLMs into specialized vulnerability identification experts. HeroCode's advancement stems from its training on a synthetically generated dataset that explicitly details the reasoning behind both vulnerability exploits and their remediations. This reasoning-rich data is leveraged by our core Hierarchical Epistemic Robust Optimization (HERO) architecture, a framework that integrates distributional robustness across multiple abstraction levels to compel a deeper understanding of fundamental security patterns rather than superficial semantics. Empirical evaluations demonstrate that HeroCode substantially outperforms existing methods, setting new state-of-the-art results on the PrimeVul and SVEN benchmarks. When integrated with Qwen2.5-Coder-7B-Instruct, HeroCode achieves 60.66% accuracy on PrimeVul, surpassing even GPT-4's 52.24% and demonstrating its strong capability to distinguish between vulnerable and patched code implementations.
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 21794