Keywords: Statistical opacity, unlearnable code, AI-resistant software, foundation models, software security
TL;DR: We move from "How do we keep adversaries from seeing our code?" to "How do we ensure that seeing our code doesn't help?"
Abstract: The LLM foundation model era has inverted a fundamental assumption in software engineering. Code, once written, no longer belongs exclusively to its creators. Any publicly accessible code becomes training data, absorbed into models that can reproduce, adapt, and redistribute it without consent. This paper argues that this shift poses not merely a legal or ethical challenge but a technical one, requiring new defensive primitives. We introduce the concept of statistical opacity, defined as the deliberate design of code representations that resist neural pattern extraction while preserving human readability and machine executability. We articulate a research agenda spanning theory, mechanisms, tools, and evaluation. We contend that statistical opacity will become as fundamental to software security as cryptography became to data security. Just as the community learned to design systems assuming adversaries could intercept communications, we must now learn to design systems assuming adversaries can learn from code.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public.
Paper Type: Short papers (i.e., vision, new ideas, and position papers). 2–4 pages
Reroute: false
Submission Number: 44