Memorization and Privacy Risks in Domain-Specific Large Language Models

Published: 05 Mar 2024 · Last Modified: 08 May 2024 · ICLR 2024 R2-FM Workshop Poster · CC BY 4.0
Keywords: Large Language Model, Memorization, Privacy Leakage, Domain Adaptation
Abstract: Recent literature has explored fine-tuning LLMs on domain-specific corpora to improve performance in those domains. However, the risk that these models memorize and leak sensitive information from third-party custom fine-tuning data poses significant potential harm to individuals and organizations. Given this risk, and the widespread use of domain-specific LLMs in many high-stakes domains, it is imperative to explore whether, and to what degree, domain-specific LLMs memorize their fine-tuning data. Through a series of experiments, we show that these models exhibit a significant capacity to memorize fine-tuning data, which results in substantial privacy leakage. Furthermore, our investigations reveal that randomly removing certain words and rephrasing prompts show promising performance in mitigating memorization.
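The abstract mentions random word removal as a mitigation but does not specify the procedure. Below is a minimal sketch of one plausible form of this defense, assuming a per-word drop probability applied to the prompt before it reaches the fine-tuned model; the function name, drop rate, and eligibility of all words for removal are illustrative assumptions, not the paper's stated method.

```python
import random

def randomly_remove_words(prompt: str, drop_prob: float = 0.1, seed=None) -> str:
    """Drop each whitespace-delimited word with probability `drop_prob`.

    A hypothetical sketch of the word-removal mitigation; the paper's exact
    procedure (which words are eligible, the removal rate) is not given here.
    """
    rng = random.Random(seed)
    words = prompt.split()
    kept = [w for w in words if rng.random() >= drop_prob]
    # Never return an empty prompt; fall back to the original if all words drop.
    return " ".join(kept) if kept else prompt

# Example: perturb a prompt before querying a domain-specific model.
print(randomly_remove_words(
    "Patient John Doe was admitted on 03/14 with chest pain",
    drop_prob=0.2, seed=0))
```

The intuition behind such a perturbation is that verbatim memorization is often triggered by exact-prefix matches to training data, so breaking the exact token sequence can reduce regurgitation while largely preserving the query's meaning.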
Submission Number: 84