Keywords: Memorization, Training Data Extraction, Copyright
TL;DR: An inexpensive method to distinguish memorization from generalization in LLMs
Abstract: This work proposes a computationally inexpensive method for measuring memorization of training data in Large Language Models (LLMs) while accounting for generalization. Prior approaches, such as counterfactual memorization, have been computationally expensive and have therefore only been studied in limited settings. In contrast, our new metric, Prior-Aware memorization, does not require training any new models and can thus be applied directly to existing LLMs trained on large amounts of data. We evaluate our metric on two pre-trained models, Llama and OPT, trained on Common Crawl and The Pile, respectively. We find that, for the largest models, 55-90% of the sequences that earlier approaches would classify as ``memorized'' are, in fact, generalizable.
Primary Area: generative models
Submission Number: 20602