Preserving Privacy in Large Language Models: A Survey on Current Threats and Solutions

TMLR Paper 3161 Authors

09 Aug 2024 (modified: 14 Nov 2024) · Under review for TMLR · CC BY 4.0
Abstract: Large Language Models (LLMs) represent a significant advancement in artificial intelligence, finding applications across various domains. However, their reliance on massive internet-sourced datasets for training raises notable privacy concerns, which are exacerbated in critical domains (e.g., healthcare). Moreover, certain application-specific scenarios may require fine-tuning these models on private data. This survey critically examines the privacy threats associated with LLMs, emphasizing the potential for these models to memorize and inadvertently reveal sensitive information. We explore current threats by reviewing privacy attacks on LLMs and propose comprehensive solutions for integrating privacy mechanisms throughout the entire learning pipeline. These solutions range from anonymizing training datasets to implementing differential privacy during training or inference and machine unlearning after training. Our comprehensive review of existing literature highlights ongoing challenges, available tools, and future directions for preserving privacy in LLMs. This work aims to guide the development of more secure and trustworthy AI systems by providing a thorough understanding of privacy preservation methods and their effectiveness in mitigating risks.
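For reference, the training- and inference-time solutions mentioned in the abstract build on the standard notion of (ε, δ)-differential privacy (whose mechanisms are defined in the paper's preliminaries); stated as a brief sketch, a randomized mechanism $\mathcal{M}$ is (ε, δ)-differentially private if, for all neighbouring datasets $D, D'$ and every measurable set of outputs $S$,

$$\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[\mathcal{M}(D') \in S] + \delta.$$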
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission:
- `Section 1 - Introduction`: Updated the informative card of PII (Suggestions by `Reviewer iggB`) and updated the taxonomy.
- `Section 2 - Preliminaries`: Added definitions of the Laplace/Gaussian/Exponential Mechanisms and fixed a typo in Rényi Differential Privacy (Suggestions by `Reviewer iggB`).
- `Section 2 - Preliminaries`: Removed the subsections on machine learning and deep learning (Suggestions by `Reviewer iggB`).
- `Section 2 - Preliminaries`: Clarified the definition of LLMs as autoregressive generative models and explained why insights from models like BERT remain relevant to LLM privacy (Suggestions by `Reviewer Fx2K`).
- `Section 3.2 - Membership Inference Attacks (MIA)`: We revisited this section to limit the space dedicated to non-LLM attacks (as suggested by `Reviewer Fx2K`). In particular:
  - We removed `Section MIA with Shadow Models` and summarized the most relevant findings at the beginning of `Section 3.2`.
  - We expanded the discussion of threshold-based MIAs at the end of the corresponding subsection (previously `Section 3.2.2`, now `3.2.1 - MIA with Thresholds`).
- `Section 3.3 - Model Inversion and Stealing`: We revisited this section by summarizing relevant pre-LLM contributions (as suggested by `Reviewer Fx2K`) and added some relevant works to the discussion (`Reviewer Fx2K` and `Reviewer iggB`):
  - `Section 3.3.1 - Model Output Inversion`: We streamlined this section by moving pre-LLM contributions to the general discussion at the beginning of `Section 3.3`, while adding some newer, relevant contributions to the discussion (as suggested by `Reviewer Fx2K`).
  - `Section 3.3.3 - Model Stealing (New!)`: We discuss the possibility of model stealing in the LLM context (as suggested by `Reviewer iggB`).
- `Section 3.4 - Privacy Threats at Inference Time (New!)`: We added a new section discussing the risks of prompting LLMs with private data at inference time (as suggested by `Reviewer iggB`).
- `Section 4 - Data`: Added clarifications on why data anonymization is important in the context of LLMs (`Reviewer iggB`); added LLM works that use data anonymization techniques in the pretraining phase (`Reviewer iggB`).
- `Section 4.1 - Anonymization`: Added subparagraphs to improve readability (Suggestions by `Reviewer Fx2K`).
- `Section 5.3 - Federated Learning`: Streamlined this section by keeping works relevant to LLMs in the main paper and moving other studies to the appendix (Suggestions by `Reviewer iggB` and `Reviewer Fx2K`).
- `Section 5.4 - Machine Unlearning`: Streamlined this section, focusing more directly on recent advancements in LLM-specific unlearning techniques, and expanded the discussion with some relevant contributions (Suggestion by `Reviewer iggB`).
- `Section 5.5 (New!) - Homomorphic Encryption`: We added a new section on works using cryptography-based approaches (Suggestion by `Reviewer CECd`).
- `Section 7 (New!) - Limitations and Future Directions`: We added a new section discussing current limitations and future directions for privacy preservation in LLMs (Suggestion by `Reviewer Fx2K`).
Assigned Action Editor: ~Tian_Li1
Submission Number: 3161