Keywords: NLP, privacy, LLM, data minimization, data sanitization
TL;DR: We introduce a framework that operationalizes data minimization for LLM prompting as finding the least privacy-revealing prompt that preserves task utility, showing that more capable LLMs tolerate greater minimization and establishing a predictive baseline.
Abstract: The rapid deployment of large language models (LLMs) in consumer applications has led to frequent exchanges of personal information. To obtain useful responses, users often share more than necessary, increasing privacy risks via memorization, context-based personalization, or security breaches. We present a framework to formally define and operationalize data minimization: for a given user prompt and a response model, quantifying the least privacy-revealing disclosure that maintains utility, and propose a priority-queue tree search to locate this optimal point within a privacy-ordered transformation space. We evaluate the framework on four datasets spanning open-ended conversations (ShareGPT, WildChat) and knowledge-intensive tasks with single-ground-truth answers (CaseHold, MedQA), quantifying the achievable data minimization with nine LLMs as the response model. Our results demonstrate that, for the same user prompts, larger frontier LLMs can tolerate stronger levels of data minimization while maintaining task quality, whereas smaller open-source models are less robust to aggressive minimization. In addition, by comparing with our oracles, we show that LLMs are poor predictors of data minimization, exhibiting a consistent bias toward abstraction that leads to significant oversharing. By providing an oracle for data minimization, our framework establishes a principled and empirically validated way to balance privacy preservation with task utility in LLM applications.
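The priority-queue tree search named in the abstract could look roughly like the minimal sketch below. The helper names (`transformations`, `privacy_cost`, `preserves_utility`) are hypothetical placeholders standing in for the paper's transformation space, privacy ordering, and utility check, and the sketch assumes privacy cost decreases monotonically along each transformation path; it is not the authors' actual implementation.

```python
import heapq

def minimize_prompt(original, transformations, privacy_cost, preserves_utility):
    """Best-first search for the least privacy-revealing variant of `original`
    that still preserves task utility.

    Assumed helpers (hypothetical):
      transformations(p)   -> candidate prompts revealing strictly less than p
      privacy_cost(p)      -> disclosure score (lower = less revealing)
      preserves_utility(p) -> True if the response model's output on p keeps task quality
    """
    best, best_cost = original, privacy_cost(original)
    heap = [(best_cost, original)]          # min-heap ordered by privacy cost
    seen = {original}
    while heap:
        cost, prompt = heapq.heappop(heap)
        if not preserves_utility(prompt):
            continue                        # prune: do not minimize further past a utility failure
        if cost < best_cost:
            best, best_cost = prompt, cost  # new least-revealing prompt that still works
        for child in transformations(prompt):
            if child not in seen:
                seen.add(child)
                heapq.heappush(heap, (privacy_cost(child), child))
    return best
```

The priority queue expands low-cost (most minimized) candidates first, and pruning at utility failures keeps the search from descending into branches that have already lost the information the task needs.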
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 22557