Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs

Published: 25 Sept 2024, Last Modified: 06 Nov 2024 · NeurIPS 2024 poster · CC BY 4.0
Keywords: Generative AI, Memorization, LLMs, Law, Privacy
TL;DR: A technique to prevent LLMs from generating memorized training data with little to no impact on downstream performance.
Abstract: Large language models can memorize and repeat their training data, causing privacy and copyright risks. To mitigate memorization, we introduce a subtle modification to the next-token training objective that we call the goldfish loss. During training, a randomly sampled subset of tokens is excluded from the loss computation. These dropped tokens are not memorized by the model, which prevents verbatim reproduction of a complete chain of tokens from the training set. We run extensive experiments training billion-scale LLaMA-2 models, both pre-trained and trained from scratch, and demonstrate significant reductions in extractable memorization with little to no impact on downstream benchmarks. _Code and checkpoints: https://github.com/ahans30/goldfish-loss_
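The abstract describes the core mechanism: a next-token cross-entropy loss in which a random subset of target tokens is masked out, so those tokens contribute no gradient and cannot be memorized verbatim. Below is a minimal PyTorch sketch of that idea, assuming a standard causal-LM training loop; the function name `goldfish_loss`, the `drop_prob` parameter, and the per-token Bernoulli masking shown here are illustrative assumptions, not the authors' reference implementation (see the linked repository for that).

```python
import torch
import torch.nn.functional as F

def goldfish_loss(logits, labels, drop_prob=0.25, ignore_index=-100):
    """Next-token cross-entropy where a random subset of target tokens
    is excluded ("dropped") from the loss computation.

    logits: (batch, seq_len, vocab) model outputs
    labels: (batch, seq_len) input token ids
    drop_prob: illustrative probability of dropping each target token
    """
    # Shift so that the logits at position t predict the token at t + 1.
    shift_logits = logits[:, :-1, :].contiguous()
    shift_labels = labels[:, 1:].contiguous().clone()

    # Randomly drop a subset of target tokens: masked positions receive
    # no gradient, so the model never learns to reproduce them verbatim.
    drop_mask = torch.rand_like(shift_labels, dtype=torch.float) < drop_prob
    shift_labels[drop_mask] = ignore_index  # skipped by cross_entropy

    return F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
        ignore_index=ignore_index,
    )
```

In a training loop this would simply replace the usual cross-entropy call, e.g. `loss = goldfish_loss(model(input_ids).logits, input_ids)`; because a dropped token still appears in the model's input context, only its role as a prediction target is removed.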
Supplementary Material: zip
Primary Area: Generative models
Submission Number: 15907