Evaluating LLM Memorization Using Soft Token Sparsity

Published: 05 Mar 2025, Last Modified: 09 Apr 2025
Venue: SLLM
License: CC BY 4.0
Track: long paper (up to 4 pages)
Keywords: LLM memorization, sparse token
Abstract: Large language models (LLMs) memorize portions of their training data, posing risks to privacy and copyright protection. Existing work proposes several definitions of memorization, often with the goal of enabling practical testing. In this work, we investigate compressive memorization and address its key limitation: computational inefficiency. To this end, we propose the adversarial sparsity ratio (ASR) as a proxy for compressive memorization. The ASR identifies sparse soft prompts that elicit target sequences, enabling a more computationally tractable assessment of memorization. Empirically, we show that ASR effectively distinguishes between memorized and non-memorized content. Furthermore, beyond verbatim memorization, ASR also captures memorization of underlying knowledge, offering a scalable and interpretable tool for analyzing memorization in LLMs.
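To make the idea concrete, below is a minimal sketch of how a sparsity-constrained soft-prompt probe of this kind could be implemented in PyTorch. The paper's exact ASR definition is not reproduced here, so the L1 penalty, the near-zero threshold, the helper name `sparse_soft_prompt_ratio`, and all hyperparameters are illustrative assumptions, not the authors' method.

```python
# Hedged sketch of a sparsity-constrained soft-prompt probe for memorization.
# All names and hyperparameters are illustrative assumptions; this is not the
# paper's actual ASR implementation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def sparse_soft_prompt_ratio(model, tokenizer, target_text,
                             prompt_len=16, steps=200, lr=0.1, lambda_l1=1e-3):
    """Optimize a soft prompt (with an L1 penalty encouraging sparsity) so the
    frozen model emits `target_text`, then report the fraction of near-zero
    prompt entries as a crude sparsity score. Illustrative only."""
    device = next(model.parameters()).device
    for p in model.parameters():          # freeze the model; only the prompt trains
        p.requires_grad_(False)
    target_ids = tokenizer(target_text, return_tensors="pt").input_ids.to(device)
    embed = model.get_input_embeddings()
    target_emb = embed(target_ids)                        # (1, T, d)
    d = target_emb.size(-1)
    soft = torch.zeros(1, prompt_len, d, device=device, requires_grad=True)
    opt = torch.optim.Adam([soft], lr=lr)
    for _ in range(steps):
        inputs = torch.cat([soft, target_emb], dim=1)     # soft prompt ++ target
        logits = model(inputs_embeds=inputs).logits
        # each target token is predicted from the position just before it
        pred = logits[:, prompt_len - 1:-1, :]
        loss = torch.nn.functional.cross_entropy(
            pred.reshape(-1, pred.size(-1)), target_ids.reshape(-1))
        loss = loss + lambda_l1 * soft.abs().mean()       # sparsity pressure
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():
        near_zero = (soft.abs() < 1e-3).float().mean().item()
    return near_zero  # higher => a sparser prompt sufficed to elicit the target

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
print(sparse_soft_prompt_ratio(model, tokenizer, "The quick brown fox"))
```

A full ASR-style test would presumably compare the sparsity attainable for suspected training sequences against that of matched held-out sequences; this sketch only shows the inner soft-prompt optimization, not that comparison.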
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Submission Number: 61