Evaluating Privacy Leakage From In-Context Learning

Published: 27 Oct 2025, Last Modified: 27 Oct 2025 · NeurIPS Lock-LLM Workshop 2025 Poster · CC BY 4.0
Keywords: Differential Privacy, In-Context Learning, Large Language Model, Privacy leakage
Abstract: In-context learning (ICL) is a promising approach for adapting pre-trained Large Language Models (LLMs) to a downstream task by providing relevant exemplars in the prompt. However, a recent concern is that these exemplars can contain privacy-sensitive information that may be unintentionally leaked or regurgitated through the LLM's output. Prior work has demonstrated this privacy vulnerability through membership inference attacks. Despite this, a systematic evaluation of the factors that contribute to unintended privacy leakage from ICL is still lacking. In this work, we introduce ICLInf, a metric for measuring the potential privacy leakage of ICL exemplars, inspired by the analysis of data-dependent differential privacy (DP) and counterfactual influence. Our experimental results demonstrate that potential privacy leakage can be exacerbated by certain factors, such as parametric knowledge, model size, and exemplar position. Moreover, we show that ICLInf can be used to provide a tight privacy audit for DP sample-based ICL methods (exponential mechanism) up to $\epsilon = 10$.
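To make the counterfactual-influence idea concrete, the sketch below estimates how much a single in-context exemplar shifts the model's log-likelihood of reproducing its own sensitive content. This is an illustrative approximation only, not the paper's ICLInf definition or privacy-audit procedure; the model name (`gpt2`), the toy exemplars, and the canary string are placeholder assumptions.

```python
# Illustrative counterfactual-influence sketch (NOT the paper's ICLInf metric):
# compare the model's log-likelihood of a sensitive continuation when the
# "private" exemplar is included in the prompt versus when it is removed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: any causal LM would do for this sketch
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()


def target_log_likelihood(prompt: str, target: str) -> float:
    """Sum of log-probabilities the model assigns to `target` given `prompt`."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    target_ids = tokenizer(target, return_tensors="pt").input_ids
    full_ids = torch.cat([prompt_ids, target_ids], dim=1)
    with torch.no_grad():
        logits = model(full_ids).logits
    # Position i predicts token i+1, so shift logits/targets by one.
    log_probs = torch.log_softmax(logits[:, :-1, :], dim=-1)
    next_tokens = full_ids[:, 1:]
    token_log_probs = log_probs.gather(2, next_tokens.unsqueeze(-1)).squeeze(-1)
    # Keep only the positions that predict the target tokens.
    n_prompt = prompt_ids.shape[1]
    return token_log_probs[0, n_prompt - 1 :].sum().item()


# Toy exemplars; the "private" exemplar is the one whose influence we probe.
private_exemplar = "Review: The staff at 12 Elm St. were rude. Sentiment: negative\n"
other_exemplars = "Review: Great coffee and fast service. Sentiment: positive\n"
query = "Review: The staff at 12 Elm St. were"
canary = " rude"

ll_with = target_log_likelihood(other_exemplars + private_exemplar + query, canary)
ll_without = target_log_likelihood(other_exemplars + query, canary)

# A larger positive gap means the exemplar's content is more easily
# regurgitated, i.e., higher potential leakage in this counterfactual sense.
influence = ll_with - ll_without
print(f"counterfactual influence (log-likelihood gap): {influence:.3f}")
```

Factors highlighted in the abstract (parametric knowledge, model size, exemplar position) could be probed with this kind of with/without comparison by varying the model or where the private exemplar appears in the prompt.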
Submission Number: 18