Extracting Recurring Vulnerabilities from Black-Box LLM-Generated Software

Tomer Kordonsky; Amit LeVi; Maayan Yamin; Noam Benzimra; Avi Mendelson

Published: 23 May 2026, Last Modified: 23 May 2026, Venue: ICML 2026 AIWILD, License: CC BY 4.0

Keywords: LLM-Generated Software, Black-Box Security, Vulnerability Persistence, Agentic Workflows, Exploit Discovery, Model Fingerprinting

TL;DR: We show that LLM code generators repeat vulnerabilities, enabling a black-box attack that extracts recurring flaws from LLM-generated software without source code access, and we introduce an evaluation framework to measure this risk.

Abstract: Large language models are increasingly deployed as core engines for automated code generation, accelerating software development while also emitting insecure programs. Existing defenses rely mainly on post-hoc scanning and treat each sample in isolation, leaving open a more predictive question: are these failures recurring enough that hidden vulnerabilities can be inferred from a model's visible outputs alone? We study this threat model, which we call vulnerability persistence, and introduce the Feature-Security Table (FSTab), a model-specific mapping from observable frontend features to recurrent backend vulnerabilities, constructed from generated applications labeled with static-analysis findings. FSTab is queried using only public UI actions and endpoints, without source-code access. Across six code LLMs and five WebGenBench categories, it achieves perfect attack success and coverage in multiple held-out categories while remaining strong under cross-category transfer, reaching up to 85.67% average attack success and 84.97% average coverage when the target category is excluded during construction. We further introduce a model-centric recurrence suite over features, prompt rephrasings, and application categories, and find that cross-category transfer exceeds within-category recurrence, suggesting that many insecure implementations reflect stable model-level coding habits rather than isolated prompt artifacts. These results expose a predictive black-box attack surface in LLM-generated software and motivate evaluation of recurring model behaviors. Our code is available at https://anonymous.4open.science/r/FSTab-024E/README.md
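
To make the idea of a feature-to-vulnerability table concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify how FSTab is constructed, so everything here is an assumption made for illustration: the function names `build_fstab` and `query_fstab`, the `min_rate` recurrence threshold, the feature and vulnerability labels, and the toy data. The sketch assumes each generated application has already been reduced to a mapping from observable frontend features to the static-analysis findings attached to the backend code behind them.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: one dict per generated app, mapping an observable
# frontend feature to the static-analysis findings behind it.
TOY_APPS = [
    {"login_form": {"sql_injection"}, "file_upload": {"path_traversal"}},
    {"login_form": {"sql_injection", "weak_hashing"}, "search_box": set()},
    {"login_form": {"sql_injection"}, "file_upload": set()},
]


def build_fstab(labeled_apps, min_rate=0.5):
    """Keep, per feature, the findings that recur in at least `min_rate`
    of the apps exposing that feature (threshold is an assumption)."""
    seen = Counter()             # how many apps expose each feature
    hits = defaultdict(Counter)  # feature -> finding -> number of apps
    for app in labeled_apps:
        for feature, findings in app.items():
            seen[feature] += 1
            for finding in findings:
                hits[feature][finding] += 1

    table = {}
    for feature, counts in hits.items():
        recurrent = sorted(
            finding for finding, n in counts.items()
            if n / seen[feature] >= min_rate
        )
        if recurrent:
            table[feature] = recurrent
    return table


def query_fstab(table, observed_features):
    """Black-box query: only features visible in the target's UI are used."""
    return {f: table[f] for f in observed_features if f in table}


if __name__ == "__main__":
    table = build_fstab(TOY_APPS)
    print(query_fstab(table, ["login_form", "search_box"]))
```

In this sketch the query step never touches source code: it takes the list of features an attacker can see in the target application and returns the findings that recurred for those features during construction. Features with no recurrent findings, such as `search_box` in the toy data, simply produce no prediction.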

Track: Regular Paper (9 pages)

Email Sharing: We authorize the sharing of all author emails with Program Chairs.

Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.

Submission Number: 208
