Keywords: Privacy in LLMs, Differential Privacy, Hallucination, Trustworthiness
TL;DR: We propose PEARL, an entropy-regulated framework for private language generation.
Abstract: Large language models (LLMs) commonly adopt Retrieval-Augmented Generation (RAG) to improve faithfulness. However, carefully crafted extraction prompts can elicit sensitive private information. Differential Privacy (DP) has therefore been integrated into LLM inference and is widely regarded as a standard safeguard; yet most work focuses on the utility–privacy trade-off, leaving the trustworthiness of DP outputs underexplored. To assess trustworthiness, we revisit the confidence gap (CG), which quantifies an LLM’s internal knowledge conflict. We show that CG correlates with both hallucination and exposure of personally identifiable information (PII). Building on this insight, we present PEARL, a CG‑guided, entropy‑aware private decoding framework. PEARL adaptively allocates the privacy budget across tokens and sentences based on CG, concentrating protection on spans likely to contain PII while stabilizing low‑confidence, hallucination‑prone regions. In experiments, PEARL improves response trustworthiness and robustness to PII extraction attacks.
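The abstract does not specify how CG is computed or how the budget allocation is realized, so the following is only a minimal illustrative sketch of the CG-guided idea, not the authors' implementation. It assumes CG is the probability gap between the top two next-token candidates and that privacy is applied via an exponential-mechanism-style, epsilon-scaled softmax over the logits; the allocation rule (less budget, i.e. more noise, on high-CG tokens that are more likely to reproduce memorized PII) is likewise an assumption for illustration.

```python
import numpy as np

def confidence_gap(logits):
    """Confidence gap (CG): assumed here to be the probability gap between
    the top-1 and top-2 tokens of the next-token distribution."""
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    top2 = np.sort(probs)[-2:]
    return top2[1] - top2[0]

def allocate_epsilon(cg, eps_total, n_tokens, eps_min=0.1):
    """Toy per-token budget rule (an assumption, not the paper's rule):
    high-CG tokens get a smaller epsilon (stronger protection), while
    low-CG, hallucination-prone tokens keep more budget for stability."""
    base = eps_total / n_tokens
    return max(eps_min, base * (1.0 - cg))

def private_sample(logits, eps, rng):
    """Exponential-mechanism-style sampling: epsilon acts as an inverse
    temperature, so a smaller epsilon yields a flatter, more private
    distribution over tokens."""
    scaled = eps * (logits - logits.max())
    probs = np.exp(scaled)
    probs /= probs.sum()
    return rng.choice(len(logits), p=probs)

rng = np.random.default_rng(0)
vocab_size, n_tokens, eps_total = 50, 20, 20.0
for t in range(n_tokens):
    logits = rng.normal(size=vocab_size)   # stand-in for real model logits
    cg = confidence_gap(logits)
    eps_t = allocate_epsilon(cg, eps_total, n_tokens)
    token = private_sample(logits, eps_t, rng)
    print(f"step {t}: CG={cg:.3f}  eps={eps_t:.2f}  token={token}")
```

In this toy loop the total budget eps_total is spread over n_tokens steps and then modulated per token by CG; a faithful implementation would follow the paper's actual CG definition, sentence-level allocation, and accounting.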
Supplementary Material: zip
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 14975