Memorization and Privacy Risks in Domain-Specific Large Language Models

Published: 05 Mar 2024 · Last Modified: 08 May 2024 · ICLR 2024 R2-FM Workshop Poster · CC BY 4.0
Keywords: Large Language Model, Memorization, Privacy Leakage, Domain Adaptation
Abstract: Recent literature has explored fine-tuning LLMs on domain-specific corpora to improve performance in those domains. However, the risk that these models memorize and leak sensitive information from third-party custom fine-tuning data poses significant potential harm to individuals and organizations. Given this risk, and the widespread use of domain-specific LLMs in many high-stakes domains, it is imperative to explore whether, and to what degree, domain-specific LLMs memorize their fine-tuning data. Through a series of experiments, we show that these models exhibit a significant capacity to memorize fine-tuning data, which results in substantial privacy leakage. Furthermore, our investigations reveal that randomly removing certain words and rephrasing prompts show promising performance in mitigating memorization.
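The abstract mentions random word removal as a mitigation but does not specify the procedure. Below is a minimal sketch of one plausible form of this defense, assuming a per-word drop probability applied to the prompt before it reaches the fine-tuned model; the function name, drop rate, and eligibility of all words for removal are illustrative assumptions, not the paper's stated method.

```python
import random

def randomly_remove_words(prompt: str, drop_prob: float = 0.1, seed=None) -> str:
    """Drop each whitespace-delimited word with probability `drop_prob`.

    A hypothetical sketch of the word-removal mitigation; the paper's exact
    procedure (which words are eligible, the removal rate) is not given here.
    """
    rng = random.Random(seed)
    words = prompt.split()
    kept = [w for w in words if rng.random() >= drop_prob]
    # Never return an empty prompt; fall back to the original if all words drop.
    return " ".join(kept) if kept else prompt

# Example: perturb a prompt before querying a domain-specific model.
print(randomly_remove_words(
    "Patient John Doe was admitted on 03/14 with chest pain",
    drop_prob=0.2, seed=0))
```

The intuition behind such a perturbation is that verbatim memorization is often triggered by exact-prefix matches to training data, so breaking the exact token sequence can reduce regurgitation while largely preserving the query's meaning.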
Submission Number: 84