Hidden No More: Attacking and Defending Private Third-Party LLM Inference

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
TL;DR: We devise a simple attack that breaks existing private LLM inference methods, and we further introduce a robust multi-party private inference scheme.
Abstract: Recent advances in Large Language Models (LLMs) have led to widespread adoption of third-party inference services, raising critical privacy concerns. In this work, we introduce a novel reconstruction technique that can recover original prompts from hidden states with nearly perfect accuracy across multiple state-of-the-art LLMs in the increasingly important open-weights setting. Although the attack is conceptually simple, it has, to the best of our knowledge, neither been previously described nor shown to work in practice. Furthermore, our attack remains effective against various permutation- and noise-based defenses, challenging assumptions about the security of previously proposed schemes. To address these vulnerabilities, we propose Cascade, a multi-party inference scheme that leverages sharding along the sequence dimension to preserve the privacy of the user input. Through theoretical analysis and empirical evaluation, we demonstrate that Cascade is secure against both our attack and previous methods, while maintaining computational and communication efficiency. Our findings highlight the importance of rigorous security analysis in privacy-preserving LLM inference and offer practical solutions for secure deployment.
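To build intuition for why hidden states can leak the prompt when the model weights are public, the toy sketch below (our illustration, not the paper's attack) matches each observed hidden-state vector against hidden states precomputed for every vocabulary token. The single-layer stand-in "model" and the random weights are assumptions made purely to keep the example self-contained.

# Toy sketch (not the paper's attack): recovering prompt tokens from hidden
# states by vocabulary matching, assuming the attacker holds the public
# ("open") weights. The "model" is a stand-in: a fixed random embedding
# followed by a fixed random linear layer, so each token's first-layer hidden
# state is deterministic and can be precomputed by anyone with the weights.
import numpy as np

rng = np.random.default_rng(0)
VOCAB, HIDDEN = 1000, 64

# Public weights (known to the attacker in the open-weights setting).
embed = rng.standard_normal((VOCAB, HIDDEN))
proj = rng.standard_normal((HIDDEN, HIDDEN))

def hidden_states(token_ids):
    """Stand-in for a first transformer layer: h_t = e_{x_t} @ W."""
    return embed[token_ids] @ proj

# Server side: observes hidden states of a private prompt (here also permuted,
# mimicking a permutation-based "defense").
prompt = rng.integers(0, VOCAB, size=12)
leaked = hidden_states(prompt)
leaked = leaked[rng.permutation(len(prompt))]

# Attacker side: precompute hidden states for every vocabulary token, then
# match each leaked vector to its nearest candidate by cosine similarity.
table = hidden_states(np.arange(VOCAB))
table_n = table / np.linalg.norm(table, axis=1, keepdims=True)
leaked_n = leaked / np.linalg.norm(leaked, axis=1, keepdims=True)
recovered = (leaked_n @ table_n.T).argmax(axis=1)

print("recovered tokens (unordered):", sorted(recovered.tolist()))
print("original tokens  (unordered):", sorted(prompt.tolist()))

Even with the rows permuted, this toy attacker recovers exactly which tokens appear in the prompt, hinting at why permutation alone is a weak defense; the paper's actual attack operates on real transformer hidden states and achieves near-perfect prompt reconstruction.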
Lay Summary: Large language models (LLMs) are often run by third-party services, raising serious concerns about user data privacy. This risk motivates the need for protocols that run LLMs on encrypted prompts instead of raw user data. While many such protocols are provably secure, they are too slow to be practical, so researchers have devised faster protocols that are probabilistically secure. In these protocols, the third party does not see user inputs, but instead receives permutations of internal LLM data. It was previously believed that such permutations are infeasible to reverse. Our paper introduces a new attack that can nearly perfectly reconstruct the original prompt from such data, highlighting serious security risks in permutation-based schemes. Further, as an alternative mitigation, we develop an efficient protocol called Cascade, which splits the prompt among nodes and then performs sharded LLM inference. Through analysis of the partial internal model data each node observes, we show that Cascade is secure against our attack and other attacks in the literature. Our research emphasizes the importance of thoroughly testing AI systems for privacy vulnerabilities, and offers practical solutions to securely run LLMs via a third party, enabling safer use by the general public.
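Cascade itself coordinates inference across shards with additional machinery, but the basic idea of splitting the prompt along the sequence dimension can be sketched in a few lines; the shard sizes and node assignment below are illustrative assumptions, not details from the paper.

# Toy sketch (not the Cascade protocol): shard a prompt along the sequence
# dimension so that no single node ever holds the full input. The contiguous
# equal-size shard assignment is an illustrative choice.
from typing import Dict, List

def shard_sequence(token_ids: List[int], num_nodes: int) -> Dict[int, List[int]]:
    """Split the token sequence into contiguous shards, one per node."""
    n = len(token_ids)
    shard_len = -(-n // num_nodes)  # ceiling division
    return {
        node: token_ids[node * shard_len:(node + 1) * shard_len]
        for node in range(num_nodes)
    }

prompt_tokens = [101, 2023, 2003, 1037, 3722, 7099, 102]  # example token ids
for node, shard in shard_sequence(prompt_tokens, num_nodes=3).items():
    print(f"node {node} sees only tokens {shard}")

Because each node receives only its own slice of the token sequence, no single party holds the complete prompt.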
Link To Code: https://github.com/ritual-net/vma-external
Primary Area: Deep Learning->Large Language Models
Keywords: LLMs, Privacy, Hidden States
Submission Number: 12809