SCALR: Communication-Efficient Secure Multi-Party Logistic Regression

Published: 01 Jan 2024 · Last Modified: 07 May 2025 · IEEE Trans. Commun. 2024 · CC BY-SA 4.0
Abstract: Privacy-preserving coded computing is a popular framework for multiple data owners to jointly train machine learning models, with strong end-to-end information-theoretic privacy guarantees for the local data. A major challenge to the scalability of current approaches is their communication overhead, which is quadratic in the number of users. Towards addressing this challenge, we present SCALR, a communication-efficient collaborative learning framework for training logistic regression models. To do so, we introduce a novel coded computing mechanism that decouples the communication-intensive encoding operations from real-time training and offloads the former to a data-independent offline phase, in which the communicated variables are independent of the training data. As such, the offline phase can be executed proactively during periods of low network activity. The communication complexity of the data-dependent (online) training operations is only linear in the number of users, a substantial reduction from the quadratic complexity of the state of the art. Our theoretical analysis establishes the information-theoretic privacy guarantees and shows that SCALR matches the state of the art in terms of adversary resilience, robustness to user dropouts, and model convergence. Through extensive experiments, we demonstrate up to $80\times$ reduction in online communication overhead and $6\times$ speed-up in wall-clock training time compared to the state of the art.
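To make the offline/online decoupling concrete, the following is a minimal sketch, not the paper's actual coded computing protocol: it assumes a simple additive-masking scheme over a prime field, with all user counts, field size, and variable names chosen purely for illustration. The point it illustrates is that the randomness exchanged offline never depends on the training data, while the online phase needs only one data-dependent message per user (linear in the number of users).

```python
import numpy as np

rng = np.random.default_rng(0)
P = 2**31 - 1          # prime modulus for information-theoretic masking (illustrative choice)
N, d = 8, 4            # number of users and feature dimension (illustrative)

# ---- Offline phase (data-independent) ----
# Users establish random masks that sum to zero modulo P. Nothing exchanged
# here depends on the training data, so this step can run proactively
# during periods of low network activity.
masks = rng.integers(0, P, size=(N, d))
masks[-1] = (-masks[:-1].sum(axis=0)) % P   # force the masks to cancel in aggregate

# ---- Online phase (data-dependent) ----
# Each user sends a single masked message: O(N) total online traffic,
# instead of pairwise (quadratic) exchanges of data-dependent shares.
local_data = rng.integers(0, P, size=(N, d))        # e.g., quantized local gradient updates
online_msgs = (local_data + masks) % P              # one message per user

# The aggregator recovers only the sum; individual inputs remain hidden
# behind the offline masks.
aggregate = online_msgs.sum(axis=0) % P
assert np.array_equal(aggregate, local_data.sum(axis=0) % P)
print("aggregated update:", aggregate)
```

The sketch uses secure aggregation via additive masking as a stand-in; SCALR's actual mechanism is a coded computing scheme with the stronger guarantees (adversary resilience, dropout robustness) described in the abstract, but the offline/online communication split it relies on is the same in spirit.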