Abstract: Large Language Models (LLMs) represent a significant advancement in artificial intelligence, finding applications across various domains. However, their reliance on massive internet-sourced datasets for training raises notable privacy concerns, which are exacerbated in critical domains (e.g., healthcare). Moreover, certain application-specific scenarios may require fine-tuning these models on private data. This survey critically examines the privacy threats associated with LLMs, emphasizing the potential for these models to memorize and inadvertently reveal sensitive information. We explore current threats by reviewing privacy attacks on LLMs and propose comprehensive solutions for integrating privacy mechanisms throughout the entire learning pipeline. These solutions range from anonymizing training datasets to applying differential privacy during training or inference and performing machine unlearning after training. Our comprehensive review of existing literature highlights ongoing challenges, available tools, and future directions for preserving privacy in LLMs. This work aims to guide the development of more secure and trustworthy AI systems by providing a thorough understanding of privacy preservation methods and their effectiveness in mitigating risks.
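To make the "differential privacy during training" direction concrete, below is a minimal sketch of DP-SGD-style private training using the Opacus library. The toy model, synthetic data, and hyperparameters (`noise_multiplier`, `max_grad_norm`, `delta`) are illustrative placeholders, not values taken from the survey.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

# Toy classifier and synthetic data standing in for a private fine-tuning task.
model = nn.Linear(16, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
dataset = TensorDataset(torch.randn(256, 16), torch.randint(0, 2, (256,)))
loader = DataLoader(dataset, batch_size=32)

# Wrap model, optimizer, and data loader so gradients are clipped per sample
# and Gaussian noise is added before each update (DP-SGD).
privacy_engine = PrivacyEngine()
model, optimizer, loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=loader,
    noise_multiplier=1.0,  # scale of Gaussian noise (placeholder value)
    max_grad_norm=1.0,     # per-sample gradient clipping bound (placeholder value)
)

criterion = nn.CrossEntropyLoss()
for x, y in loader:  # one epoch of private training
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()

# Query the privacy budget spent so far for a chosen delta.
print(f"epsilon spent: {privacy_engine.get_epsilon(delta=1e-5):.2f}")
```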
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: - `Taxonomy`: We decided to remove the cited papers to avoid appearing to prioritize specific works and to ensure the taxonomy remains more adaptable to future developments, as the conceptual divisions are less likely to require updates.
- `Section 3 - Attacks`: We have added a discussion of the methods initially formulated to attack other ML models and indicated more clearly how they were generalized to the LLM scenario (`Sections 3.2` and `3.3`).
- `Section 4 - Data`: Added a general discussion on incorporating data-level privacy methods into the pre-processing pipeline of an LLM; Included considerations on how these methods might scale to datasets of the size used in LLM training.
- `Section 4.1 - Anonymization`: Expanded the discussion on several methods, focusing on their general applicability, relevance to LLM training data, and scalability to large datasets.
- `Section 4.2 - Anonymization with Differential Privacy`: Added details to better explain the usage of these methods in the context of LLMs; Included considerations on how the proposed methods would scale to the substantial datasets required for LLM pre-training.
- `Section 5 - Model`: Clarified the generalizability of DP-SGD.
- `Section 5.1.1 - Training Large Language Models with Differential Privacy`: Added discussion to emphasize potential implications of scaling the proposed methods, particularly those applied to smaller language models, to larger models.
- `Section 5.1.2 - Fine-Tuning with Differential Privacy`: Expanded on methods developed for other networks to clarify their suitability for LLMs; Highlighted methods specifically designed for LLMs; Added discussions to underline the scalability potential of applicable methods.
- `Section 5.1.3 - Parameter Efficient Fine-Tuning with Differential Privacy`: Included additional discussion to highlight how these methods are particularly suitable for LLMs.
- `Section 5.2 - Inference with Differential Privacy`: Added discussion to underscore the specificity of these methods to LLMs.
- `Acknowledgements`: Added Acknowledgements.
Assigned Action Editor: ~Tian_Li1
Submission Number: 3161