Beyond Per-Question Privacy: Multi-Query Differential Privacy for RAG Systems

Published: 29 Sept 2025, Last Modified: 12 Oct 2025 · NeurIPS 2025 - Reliable ML Workshop · CC BY 4.0
Keywords: Differential privacy, Retrieval-Augmented Generation
TL;DR: A differentially private RAG algorithm for online multi-query release with per-document privacy accounting
Abstract: Retrieval-augmented generation (RAG) enhances large language models (LLMs) by retrieving documents from an external dataset at inference time. When this dataset contains sensitive and private information, prior work shows that an unprotected RAG system risks leaking it, compromising the data owner's privacy. Existing work has studied protecting this information from leaking through a single query to RAG under differential privacy (DP). In this paper, we focus on the more practical setting where DP is extended to multiple queries to RAG. We propose two new algorithms that ensure DP in multi-query RAG. Our first method, \dpfixtau, applies an individual privacy accounting framework, allowing the privacy cost to depend on how often each document is retrieved rather than on the total number of queries issued. Our second method, \dpadaptovetau, further improves efficiency within the individual privacy accounting framework by adaptively releasing private query-specific thresholds for more precise selection of relevant documents. Experiments across four question datasets and three LLMs show that our methods answer $100$ queries under $\varepsilon=10$, whereas baseline methods require $\varepsilon=1000$ for comparable utility. We also highlight scenarios where our approach outperforms fixed-threshold baselines and discuss when individual accounting is preferable to subsampling-based techniques.
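The core idea of individual privacy accounting in the abstract (charging privacy cost per document, only when that document is actually retrieved, instead of per query against a global budget) can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual \dpfixtau or \dpadaptovetau algorithms; the budget values, charging rule, and function names are all assumptions made for illustration.

```python
# Hypothetical sketch of per-document individual privacy accounting for RAG
# retrieval. Each document carries its own epsilon budget, which is consumed
# only when that document is retrieved; documents with exhausted budgets are
# excluded from future retrievals. All constants here are illustrative.

EPSILON_TOTAL = 10.0      # assumed total per-document privacy budget
EPS_PER_RETRIEVAL = 0.5   # assumed cost charged to a document per retrieval

def retrieve_with_accounting(query_scores, budgets, k=3):
    """Return the top-k eligible documents for a query and charge only them."""
    # Only documents with enough remaining budget may be retrieved.
    eligible = [d for d in budgets if budgets[d] >= EPS_PER_RETRIEVAL]
    ranked = sorted(eligible, key=lambda d: query_scores.get(d, 0.0), reverse=True)
    selected = ranked[:k]
    for d in selected:
        budgets[d] -= EPS_PER_RETRIEVAL  # individual accounting: only retrieved docs pay
    return selected

budgets = {f"doc{i}": EPSILON_TOTAL for i in range(5)}
scores = {"doc0": 0.9, "doc1": 0.8, "doc2": 0.1, "doc3": 0.05, "doc4": 0.02}
print(retrieve_with_accounting(scores, budgets))  # ['doc0', 'doc1', 'doc2']
```

Under this accounting, a document that is rarely relevant retains nearly all of its budget no matter how many queries arrive, which is the intuition behind the privacy cost scaling with retrieval frequency rather than with the total number of queries.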
Submission Number: 143