TL;DR: A PATE design that has high utility for diverse tasks
Abstract: The Private Aggregation of Teacher Ensembles (PATE) framework is a versatile approach to privacy-preserving machine learning. In PATE, responses computed from different parts of sensitive data are aggregated into a single response in a privacy-preserving way. Recently, multiple works applied PATE to tasks such as sequential text generation that are inherently diverse (or "hot"), with multiple valid responses. These designs, however, suffer from a tension between diversity and privacy: diversity in the responses reduces agreement, which forces the aggregation to use smaller noise scales and thus incur higher privacy loss. Yet limiting the diversity of the aggregate response is undesirable, since in modern large language models the very knowledge we want to transfer is encapsulated in the response distribution.
We propose \emph{hot PATE}, tailored for the diverse setting where responses are distributions. We formally define \emph{preserving diversity} and design an efficient aggregation method that provably transfers the diversity to the (randomized) aggregate response while incurring no privacy penalty. The method can be implemented using API access to proprietary models and used as a plug-in replacement for the baseline ``cold'' PATE in existing tools. We demonstrate empirically the potential of hot PATE for an order-of-magnitude improvement on a task of in-context learning via prompts.
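To make the diversity/privacy tension concrete, the following is a minimal sketch of the baseline ``cold'' PATE aggregator that the abstract contrasts against: each teacher casts one vote, noise is added to the per-class counts, and the noisy argmax is released. The function name and parameters are illustrative, not the paper's hot-PATE method. When teacher votes are split across many valid responses (the ``hot'' regime), the count gap shrinks, so recovering a correct answer requires smaller noise and hence higher privacy loss.

```python
import numpy as np

def noisy_argmax(votes, num_classes, noise_scale, rng=None):
    # Baseline ("cold") PATE aggregation: tally teacher votes,
    # perturb the counts with Laplace noise, release the argmax.
    # Illustrative sketch only; hot PATE aggregates differently.
    rng = rng if rng is not None else np.random.default_rng()
    counts = np.bincount(votes, minlength=num_classes).astype(float)
    counts += rng.laplace(scale=noise_scale, size=num_classes)
    return int(np.argmax(counts))

# High agreement: 90 of 100 teachers vote class 2, so the consensus
# survives even substantial noise.
votes = np.array([2] * 90 + [5] * 10)
print(noisy_argmax(votes, num_classes=10, noise_scale=1.0))
```

With a diverse task the 90/10 split above would instead be spread thinly over many classes, and the same noise scale would frequently flip the argmax; that is the failure mode hot PATE is designed to avoid.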
Primary Area: Social Aspects->Privacy
Keywords: PATE, diverse tasks, language generation
Submission Number: 7654