Track: long paper (up to 4 pages)
Keywords: LLM memorization, sparse token
Abstract: Large language models (LLMs) memorize portions of their training data, posing threats to privacy and copyright protection. Existing work proposes several definitions of memorization, often with the goal of practical testing. In this work, we investigate compressive memorization and address its key limitation: computational inefficiency. To this end, we propose the adversarial sparsity ratio (ASR) as a proxy for compressive memorization. ASR identifies sparse soft prompts that elicit target sequences, enabling a more computationally tractable assessment of memorization. Empirically, we show that ASR effectively distinguishes between memorized and non-memorized content. Furthermore, beyond verbatim memorization, ASR also captures memorization of underlying knowledge, offering a scalable and interpretable tool for analyzing memorization in LLMs.
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Submission Number: 61