Keywords: large language model, privacy leakage, defense, cloud service
TL;DR: This paper exposes a significant user-privacy vulnerability in LLM cloud services and introduces a plug-and-play distributed inference paradigm to mitigate the leakage.
Abstract: Large language models (LLMs) have witnessed substantial growth in recent years. To leverage convenient LLM cloud services, users inevitably have to upload their prompts. Moreover, tasks such as translation, reading comprehension, and summarization inherently require uploading related files or contexts, whether or not they contain private information. Despite the rapid advancement of LLM capabilities, research on preserving user privacy during inference remains scarce. To this end, this paper conducts a comprehensive study of this domain. First, we demonstrate that (1) the embedding space of tokens is remarkably sparse, and (2) LLMs primarily operate in an orthogonal subspace of the embedding space; together, these two factors make privacy extremely vulnerable. We then analyze the structural characteristics of LLMs and design a distributed privacy-preserving inference paradigm that effectively resists privacy attacks. Finally, we comprehensively evaluate the defended models on mainstream tasks and find that low-bit quantization techniques combine well with our inference paradigm, achieving a balance among privacy, utility, and runtime memory efficiency.
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Supplementary Material: zip
Submission Number: 356