Compared to differentially private centralized learning, where all data are aggregated at one party, differentially private massively distributed learning poses one key challenge: minimizing communication overhead while achieving a strong utility-privacy tradeoff. The minimal amount of communication for distributed learning is non-interactive communication, i.e., each party sends only a single message.
In this work, we propose two differentially private, non-interactive, distributed learning algorithms in a framework called Secure Distributed \helmet. This framework is based on what we coin blind averaging: each party locally learns and noises a model, and all parties then jointly compute the mean of their models via a secure summation protocol (e.g., secure multiparty computation). The learning algorithms we consider for blind averaging are empirical risk minimizers (ERM) such as SVMs and the Softmax-activated single-layer perceptron (Softmax-SLP). We show that blind averaging preserves privacy if the models are averaged via secure summation and the objective function is smooth, Lipschitz, and strongly convex. We further show that the objective function of Softmax-SLP fulfills these criteria, which implies leave-one-out robustness and might be of independent interest.
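The blind-averaging pipeline described above can be illustrated with a minimal sketch. All names, the noise calibration, and the masking scheme below are illustrative assumptions, not the paper's exact construction: the local learner is an L2-regularized logistic-loss model standing in for the ERM learners (SVM / Softmax-SLP), the noise follows output perturbation in the style of Chaudhuri et al. for a Lipschitz, strongly convex objective, and secure summation is simulated by cyclic additive masks that cancel in the sum.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def local_erm(X, y, lam=0.1, steps=200, lr=0.1):
    """Locally fit an L2-regularized logistic-loss model (a smooth, Lipschitz,
    strongly convex objective) by gradient descent; a stand-in for the
    paper's ERM learners."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        margins = y * (X @ w)
        grad = -(X * (y * sigmoid(-margins))[:, None]).mean(axis=0) + lam * w
        w -= lr * grad
    return w

def noise_model(w, n, lam, epsilon):
    """Output perturbation: for a 1-Lipschitz, lam-strongly-convex objective
    over n points, the L2 sensitivity is 2/(n*lam); add Laplace noise with a
    matching scale (illustrative calibration, not the paper's exact one)."""
    scale = 2.0 / (n * lam * epsilon)
    return w + rng.laplace(scale=scale, size=w.shape)

def secure_sum(models):
    """Toy stand-in for secure summation: each party masks its model with
    cyclic pairwise random vectors that telescope to zero in the sum, so
    only the aggregate is revealed."""
    k = len(models)
    masks = [rng.normal(size=models[0].shape) for _ in range(k)]
    shares = [m + masks[i] - masks[(i + 1) % k] for i, m in enumerate(models)]
    return sum(shares)

def blind_average(parties, lam=0.1, epsilon=1.0):
    """Each party learns and noises locally; the mean is computed blindly."""
    models = [noise_model(local_erm(X, y, lam), len(y), lam, epsilon)
              for X, y in parties]
    return secure_sum(models) / len(models)
```

Note that each party's model is noised before aggregation, so privacy does not rest on trusting the aggregator; the secure summation only ever reveals the average.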
On the practical side, we provide experimental evidence that blind averaging for SVMs and Softmax-SLP can have a strong utility-privacy tradeoff: we reach an accuracy of $86\,\%$ on CIFAR-10 for $\varepsilon = 0.36$ and $1{,}000$ users and of $44\,\%$ on CIFAR-100 for $\varepsilon = 1.18$ and $100$ users, both after SimCLR-based pre-training. As an ablation, we study the resilience of our approach to a strongly non-IID setting. On the theoretical side, we show that, in the limit, blind averaging of hinge-loss-based SVMs converges to the centrally learned SVM. Our proof is based on the representer theorem and can serve as a blueprint for proving convergence for other ERM problems such as Softmax-SLP.