Scalable Private Learning with PATE


Nov 07, 2017 (modified: Nov 07, 2017) ICLR 2018 Conference Blind Submission
  • Abstract: Recently, there has been increased attention to the privacy concerns raised by machine learning (ML) models trained on highly sensitive data, such as medical records or personal information. To resolve those concerns, one attractive approach is the Private Aggregation of Teacher Ensembles (PATE), which has shown that knowledge from an ensemble’s aggregated answers can be transferred to train models with strong differential-privacy guarantees. Yet, while promising, PATE applications have so far been limited to simple classification tasks like MNIST; its scalability to other tasks was unclear because of inherent limitations of the proposed noise distributions and its dependency on accurate aggregation and voting. In this work, we enable scalable applications of PATE. For this, we leverage two key insights: aggregation mechanisms with concentrated noise may mitigate these limitations, and an ensemble of teachers designed to answer only questions on which they generally agree can still successfully transfer its knowledge to the student. Intuitively, such consensus answers also ought to incur lower privacy costs. With new noisy mechanisms and tighter privacy analyses, we use these insights to greatly improve PATE’s tradeoffs, thereby leading to better scalability. In experiments, we improve on the state of the art on privacy-preserving ML benchmarks, and we also demonstrate the successful application of PATE with our new ideas to a real-world task with imbalanced and partly mislabeled data involving hundreds of classes.
  • Keywords: privacy, differential privacy, machine learning, deep learning
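
The two insights in the abstract (aggregating teacher votes under concentrated noise, and answering only when the teachers largely agree) can be illustrated with a minimal sketch. This is not the paper's mechanism or its privacy analysis: Gaussian noise is assumed here as one example of a concentrated distribution, and the function name, the threshold test, and the parameters `threshold`, `sigma_check`, and `sigma_vote` are hypothetical choices made for illustration. No privacy accounting is performed.

```python
import random
from collections import Counter
from typing import Optional, Sequence


def noisy_consensus_label(
    teacher_votes: Sequence[int],
    num_classes: int,
    threshold: float,
    sigma_check: float,
    sigma_vote: float,
    rng: random.Random,
) -> Optional[int]:
    """Return a noisy plurality label, or None when consensus is too weak.

    Illustrative only: the Gaussian noise and the threshold test are
    assumptions of this sketch, not details taken from the abstract.
    """
    # Count how many teachers voted for each class.
    counts = Counter(teacher_votes)
    histogram = [counts.get(c, 0) for c in range(num_classes)]

    # Consensus check: decline to answer unless the noisy top count
    # reaches the threshold.
    if max(histogram) + rng.gauss(0.0, sigma_check) < threshold:
        return None

    # Noisy argmax: perturb each count independently, pick the largest.
    noisy = [n + rng.gauss(0.0, sigma_vote) for n in histogram]
    return max(range(num_classes), key=noisy.__getitem__)


if __name__ == "__main__":
    rng = random.Random(0)
    votes = [1] * 80 + [0] * 15 + [2] * 5  # 100 hypothetical teachers
    print(noisy_consensus_label(votes, 3, 60.0, 10.0, 5.0, rng))
```

A student model would be trained only on the queries for which the function returns a label. Queries that return `None` are skipped, which matches the intuition that questions without consensus are the ones worth declining.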