Extracting Recurring Vulnerabilities from Black-Box LLM-Generated Software

Tomer Kordonsky; Amit LeVi; Maayan Yamin; Noam Benzimra; Avi Mendelson

Published: 16 Jun 2026, Last Modified: 16 Jun 2026

Venue: ICML 2026 Workshop DL4C

License: CC BY-NC 4.0

Keywords: Code Agents, AI Security, Benchmark and Evaluation, Threat Model

TL;DR: Predictable vulnerability patterns persist in LLM-generated code across domains; we present a novel black-box attack and a new evaluation framework.

Abstract: Large language models are increasingly deployed as core engines for automated code generation, accelerating software development while also emitting insecure programs. Existing defenses rely mainly on post-hoc scanning and treat each sample in isolation, leaving open a more predictive question: are these failures recurring enough that hidden vulnerabilities can be inferred from a model's visible outputs alone? We study this threat model, which we call vulnerability persistence, and introduce the Feature-Security Table (FSTab), a model-specific mapping from observable frontend features to recurrent backend vulnerabilities, constructed from generated applications labeled with static-analysis findings. FSTab is queried using only public UI actions and endpoints, without source-code access. Across six code LLMs and five WebGenBench categories, it achieves perfect attack success and coverage in multiple held-out categories while remaining strong under cross-category transfer, reaching up to 85.67% average attack success and 84.97% average coverage when the target category is excluded during construction. We further introduce a model-centric recurrence suite over features, prompt rephrasings, and application categories, and find that cross-category transfer exceeds within-category recurrence, suggesting that many insecure implementations reflect stable model-level coding habits rather than isolated prompt artifacts. These results expose a predictive black-box attack surface in LLM-generated software and motivate evaluation of recurring model behaviors. Our code is available at https://anonymous.4open.science/r/FSTab-024E/README.md

Submission Number: 39
