Keywords: Large Language Models, Copyright Protection, Security and Privacy
Abstract: Large language models memorize and reproduce copyrighted content from their training data, raising significant legal concerns. Existing protection methods either exclude copyrighted data entirely, sacrificing model capabilities, or apply unstable regularization that causes training collapse. We introduce the first energy-based framework for copyright protection, reformulating memorization prevention as energy minimization rather than probability manipulation. Our key insight is that assigning higher energy to copyrighted sequences creates an exponential barrier to their reproduction, with protection strength naturally scaling with sequence length. We propose Adaptive Energy Regularization (AER), which dynamically balances copyright protection and model utility. We provide rigorous theoretical foundations: proving convergence under the Polyak-Łojasiewicz condition, establishing exponential suppression bounds that scale with sequence length, and guaranteeing robustness under distribution shift. Empirically, across \textbf{19} models ranging from 124M to 14B parameters, AER reduces verbatim reproduction from up to \textbf{99.1\%} to below \textbf{1\%} while preserving perplexity within \textbf{3.2\%} of baseline. Our energy-based approach provides a principled and stable solution to copyright protection, establishing a paradigm for controlling memorization in generative AI.
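The abstract's core mechanism can be illustrated with a small sketch. The submission does not publish its loss function here, so everything below is an assumption for illustration only: we take a sequence's "energy" to be its negative log-likelihood under the model (so energy sums over tokens, which is why protection would naturally scale with sequence length), and we mimic an adaptive regularizer with a hypothetical hinge penalty whose weight decays as the energy barrier is reached. The names `sequence_energy`, `aer_penalty`, and the `margin`/`base_weight` parameters are invented for this sketch and are not taken from the paper.

```python
import math

def sequence_energy(token_probs):
    """Energy of a sequence: negative log-likelihood under the model.
    Higher energy means the model is exponentially less likely to
    reproduce the sequence verbatim; the sum over tokens makes the
    barrier grow with sequence length."""
    return -sum(math.log(p) for p in token_probs)

def aer_penalty(token_probs, margin=50.0, base_weight=0.1):
    """Hypothetical AER-style penalty (illustrative, not the paper's
    actual loss): push the energy of a protected sequence above
    `margin`. The weight shrinks as the gap closes, so the regularizer
    backs off once the barrier is met, limiting damage to utility."""
    energy = sequence_energy(token_probs)
    gap = max(0.0, margin - energy)       # zero once the barrier is met
    weight = base_weight * gap / margin   # adaptive: decays with progress
    return weight * gap

# A memorized sequence (high per-token probability) has low energy and
# incurs a large penalty; a non-memorized sequence already sits above
# the margin and incurs none.
memorized = [0.95] * 20
diverse = [0.05] * 20
print(aer_penalty(memorized) > aer_penalty(diverse))  # → True
print(aer_penalty(diverse))                           # → 0.0
```

In a real training loop this penalty would be added to the standard language-modeling loss only for sequences flagged as copyrighted, which matches the abstract's framing of balancing protection against model utility.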
Primary Area: foundation or frontier models, including LLMs
Submission Number: 15776