Keywords: Differential privacy, Knowledge distillation, Model compression
Abstract: Privacy has emerged as a paramount concern in the development and deployment of language models. While over-parameterized models deliver exceptional performance, their deployment on resource-constrained devices (such as mobile or embedded systems) remains challenging. Model compression techniques are widely adopted to address this, yet they introduce additional privacy risks. Although differential privacy (DP) provides rigorous theoretical safeguards against data leakage, the noise injection required during compression often severely degrades model utility. Balancing high performance with strong privacy guarantees in compressed models thus remains a critical open challenge.
In this work, we introduce Privdistil, a DP-aware model compression framework that redesigns the compression pipeline to preserve utility while protecting sensitive data. Privdistil begins by training a domain classifier via DP-SGD on a hybrid dataset of public and private samples, thereby identifying public data most aligned with the private domain. It then performs model compression exclusively on these selected public samples, followed by fine-tuning the compressed model on the private dataset using DP-SGD. By shifting the compression burden to public data, Privdistil minimizes noise requirements and boosts training stability. Extensive experiments show that Privdistil consistently surpasses state-of-the-art DP compression methods across diverse datasets and architectures, delivering an average accuracy gain of over 3% on the GLUE benchmark under a strict privacy budget of $\varepsilon = 1$.
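To make the three-stage pipeline concrete, here is a minimal Python sketch assuming PyTorch with Opacus for DP-SGD. All helper names (dp_sgd_train, select_public, distill, privdistil), the hyperparameters, and the even split of the privacy budget between the two DP stages are illustrative assumptions, not the paper's implementation.

import torch
from opacus import PrivacyEngine

def dp_sgd_train(model, loader, *, epochs, epsilon, delta, clip=1.0, lr=0.1):
    # Train with DP-SGD under an (epsilon, delta) budget via Opacus.
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    model, opt, loader = PrivacyEngine().make_private_with_epsilon(
        module=model, optimizer=opt, data_loader=loader, epochs=epochs,
        target_epsilon=epsilon, target_delta=delta,
        max_grad_norm=clip,  # per-sample gradient clipping bound
    )
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()  # clipped, noised gradient step
    return model

def select_public(domain_clf, public_ds, top_k):
    # Keep the top_k public samples the DP-trained domain classifier scores
    # as most private-like (class 1 = "private domain" by convention here).
    domain_clf.eval()
    with torch.no_grad():
        scores = [domain_clf(x.unsqueeze(0)).softmax(-1)[0, 1].item()
                  for x, _ in public_ds]
    idx = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:top_k]
    return torch.utils.data.Subset(public_ds, idx)

def distill(teacher, student, loader, *, epochs=1, T=2.0, lr=0.1):
    # Plain knowledge distillation on the selected public data: match the
    # teacher's temperature-softened logits. No DP noise is injected here,
    # since no private sample is touched at this stage.
    opt = torch.optim.SGD(student.parameters(), lr=lr)
    kl = torch.nn.KLDivLoss(reduction="batchmean")
    teacher.eval()
    for _ in range(epochs):
        for x, _ in loader:
            with torch.no_grad():
                t = (teacher(x) / T).softmax(-1)
            s = (student(x) / T).log_softmax(-1)
            loss = kl(s, t) * (T * T)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return student

def privdistil(domain_clf, teacher, student, mixed_loader, public_ds,
               private_loader, *, top_k, epsilon=1.0, delta=1e-5):
    # Stage 1: DP-SGD domain classifier on the public/private mixture
    # (mixed_loader yields (x, domain_label) with private = 1).
    domain_clf = dp_sgd_train(domain_clf, mixed_loader, epochs=3,
                              epsilon=epsilon / 2, delta=delta)
    selected = torch.utils.data.DataLoader(
        select_public(domain_clf, public_ds, top_k), batch_size=64)
    # Stage 2: compress on the selected public data only (noise-free).
    student = distill(teacher, student, selected)
    # Stage 3: DP-SGD fine-tune the compressed student on private data,
    # spending the remaining budget.
    return dp_sgd_train(student, private_loader, epochs=3,
                        epsilon=epsilon / 2, delta=delta)

The key design lever in this sketch is that stage 2, the expensive compression step, spends no privacy budget at all because it sees only public data; DP noise is confined to the two lighter stages.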
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Supplementary Material: zip
Submission Number: 10148