Keywords: PATE, Privacy, Heterogeneous data
TL;DR: We propose a modification to the teacher training process in PATE to handle heterogeneous data
Abstract: Differential privacy has become the standard system to provide privacy guarantees for user data in machine learning models. One of the popular techniques to ensure privacy is the Private Aggregation of Teacher Ensembles (PATE) framework. PATE trains an ensemble of teacher models on private data and transfers the knowledge to a student model, with rigorous privacy guarantees derived using differential privacy. So far, PATE has been shown to work assuming the public and private data are distributed homogeneously. We show that in the case of high mismatch (non iid-ness) in these distributions, the teachers suffer from high variance in their individual training updates, causing them to converge to vastly different optimum states. This leads to lower consensus and accuracy for data labelling. To address this, we propose a modification to the teacher training process in PATE, that incorporates teacher averaging and update correction which reduces the variance in teacher updates. Our technique leads to improved prediction accuracy of the teacher aggregation mechanism, especially for highly heterogeneous data. Furthermore, our evaluation shows our technique is necessary to sustain the student model performance, and allows it to achieve considerable gains over the original PATE in the utility-privacy metric.