How Private is Your Attention? Bridging Privacy with In-Context Learning

TMLR Paper6452 Authors

09 Nov 2025 (modified: 23 Nov 2025) · Under review for TMLR · CC BY 4.0
Abstract: In-context learning (ICL)—the ability of transformer-based models to perform new tasks from examples provided at inference time—has emerged as a hallmark of modern language models. While recent works have investigated the mechanisms underlying ICL, its feasibility under formal privacy constraints remains largely unexplored. In this paper, we propose a differentially private pretraining algorithm for linear attention heads and present the first theoretical analysis of the privacy–accuracy trade-off for ICL in linear regression. Our results characterize the fundamental tension between optimization and privacy-induced noise, formally capturing behaviors observed in private training via iterative methods. Additionally, we show that our method is robust to adversarial perturbations of training prompts, unlike standard ridge regression. All theoretical findings are supported by extensive simulations across diverse settings.
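Illustrative sketch (not the paper's algorithm): one way to make pretraining of a linear attention head differentially private is DP-SGD-style training on in-context linear regression prompts, with per-prompt gradient clipping and Gaussian noise. The parameterization (prediction = x_q @ W @ mean_i(y_i x_i)), the clip norm C, noise multiplier sigma, and all hyperparameters below are assumptions for illustration only.

```python
# Hypothetical DP-SGD-style pretraining of a single linear attention head
# on synthetic in-context linear regression prompts (illustration only).
import numpy as np

rng = np.random.default_rng(0)
d, N, n_prompts = 5, 20, 2000               # feature dim, context length, #prompts
C, sigma, lr, batch = 1.0, 1.0, 0.05, 100   # clip norm, noise multiplier, step size, batch size

def sample_prompt():
    """One ICL prompt: context (X, y) drawn from a random linear task, plus a labeled query."""
    w_star = rng.normal(size=d)
    X = rng.normal(size=(N, d))
    y = X @ w_star
    x_q = rng.normal(size=d)
    return X, y, x_q, x_q @ w_star

def predict(W, X, y, x_q):
    """Simplified linear attention: the query reads the context through W."""
    h = (X * y[:, None]).mean(axis=0)        # (1/N) * sum_i y_i x_i
    return x_q @ W @ h, h

W = np.zeros((d, d))
prompts = [sample_prompt() for _ in range(n_prompts)]

for step in range(200):
    idx = rng.choice(n_prompts, size=batch, replace=False)
    grad_sum = np.zeros_like(W)
    for i in idx:
        X, y, x_q, y_q = prompts[i]
        pred, h = predict(W, X, y, x_q)
        g = 2.0 * (pred - y_q) * np.outer(x_q, h)        # per-prompt squared-loss gradient
        g *= min(1.0, C / (np.linalg.norm(g) + 1e-12))   # clip each prompt's gradient to norm C
        grad_sum += g
    noise = sigma * C * rng.normal(size=W.shape)         # Gaussian mechanism calibrated to C
    W -= lr * (grad_sum + noise) / batch                 # noisy averaged gradient step
```

The clip-then-noise structure makes each prompt's influence on the update bounded, which is the standard route to a differential-privacy guarantee and mirrors the optimization-versus-noise tension the abstract describes.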
Submission Type: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Yuan_Cao1
Submission Number: 6452