Keywords: Large Language Model, Memorization, Privacy Leakage, Domain Adaptation
Abstract: Recent literature has explored fine-tuning LLMs on domain-specific corpora to improve performance in the respective domains. However, the risk that these models memorize and leak sensitive information from third-party custom fine-tuning data poses significant potential harm to individuals and organizations. Given this risk, as well as the widespread use of domain-specific LLMs in many high-stakes domains, it is imperative to explore whether, and to what degree, domain-specific LLMs memorize their fine-tuning data. Through a series of experiments, we show that these models exhibit a significant capacity for memorizing fine-tuning data, which results in significant privacy leakage. Furthermore, our investigations reveal that randomly removing certain words and rephrasing prompts show promise in mitigating memorization.
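The word-removal mitigation mentioned in the abstract can be illustrated with a minimal sketch. The function name, removal rate, and whitespace tokenization below are assumptions for illustration only, not the paper's exact procedure.

```python
import random


def randomly_remove_words(text: str, removal_rate: float = 0.1, seed: int | None = None) -> str:
    """Randomly drop a fraction of words from a training example.

    A minimal sketch of the word-removal mitigation; removal_rate and the
    simple whitespace tokenization are assumptions, not the paper's method.
    """
    rng = random.Random(seed)
    words = text.split()
    kept = [w for w in words if rng.random() > removal_rate]
    return " ".join(kept)


# Hypothetical usage on a single fine-tuning record.
example = "Patient John Doe was prescribed 20mg of lisinopril on 2023-04-01."
print(randomly_remove_words(example, removal_rate=0.15, seed=0))
```

In practice such perturbation would be applied to each fine-tuning example before training, trading off some data fidelity for reduced verbatim memorization.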
Submission Number: 84