Recovering Exact Support in Federated lasso without Optimization

Published: 16 Feb 2024, Last Modified: 16 Feb 2024. Accepted by TMLR.
Abstract: Federated learning provides a framework to address the challenges of distributed computing, data ownership, and privacy over a large number of distributed clients with low computational and communication capabilities. In this paper, we study the problem of learning the exact support of sparse linear regression in the federated learning setup. We provide a simple, communication-efficient algorithm that needs only one-shot communication with the centralized server, which computes the exact support by majority voting. Our method does not require the clients to solve any optimization problem and thus can be run on devices with low computational capabilities. Our method is naturally robust to client failure, model poisoning, and straggling clients. We formally prove that our method requires a number of samples per client that is polynomial in the support size but independent of the dimension of the problem. We require the number of distributed clients to be logarithmic in the dimension of the problem. For certain classes of predictor variables (e.g., mutually independent, correlated Gaussian), the overall sample complexity matches the optimal sample complexity of the non-federated centralized setting. Furthermore, our method is easy to implement and has an overall polynomial time complexity.
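As a rough illustration of the one-shot, vote-based scheme the abstract describes, the sketch below has each client threshold a simple per-feature statistic and send a 0/1 vote to the server, which keeps the features chosen by a majority. This is not the paper's exact estimator: the marginal-covariance statistic, the fixed threshold of 1.0, the function names `client_vote` and `server_majority`, and the synthetic data settings are all assumptions made for the example.

```python
import numpy as np


def client_vote(X, y, threshold):
    """Return a 0/1 vote per feature without solving an optimization problem.

    Assumption: the client thresholds the empirical covariance between
    each feature and the response. This is an illustrative statistic only.
    """
    n = X.shape[0]
    stat = X.T @ y / n
    return (np.abs(stat) > threshold).astype(int)


def server_majority(votes):
    """Keep the features voted for by more than half of the clients."""
    votes = np.asarray(votes)
    keep = votes.sum(axis=0) > votes.shape[0] / 2
    return set(np.flatnonzero(keep).tolist())


# Synthetic setup (all values are assumptions for the example):
# independent standard Gaussian features and a 3-sparse weight vector.
rng = np.random.default_rng(0)
d, n, num_clients = 50, 200, 15
w = np.zeros(d)
w[[0, 1, 2]] = 2.0
true_support = {0, 1, 2}

votes = []
for _ in range(num_clients):
    X = rng.standard_normal((n, d))
    y = X @ w + 0.1 * rng.standard_normal(n)
    # One-shot communication: each client sends a single d-bit vote.
    votes.append(client_vote(X, y, threshold=1.0))

recovered = server_majority(votes)
```

Because the server only counts votes, a minority of missing or corrupted votes does not change the result, which is the intuition behind the robustness to stragglers and poisoned clients claimed in the abstract.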
Submission Length: Regular submission (no more than 12 pages of main content)
Previous TMLR Submission Url: 1543
Changes Since Last Submission: Dear AC, we have updated our manuscript:
- We have added an explanatory note below Lemma 2 regarding having a small proportion of clients $i$ for which $\sigma^i_{jj}=0$.
- Section 8.3: We added this subsection to test different proportions of straggling clients.
- Section 8.4: We added this subsection to compare with centralized lasso in support recovery and runtime.
- Section 9.5: We added this subsection to test different levels of regularization in the real-world experiment.
- Appendix: We removed statements of appendix lemmas from inside the proofs of main-text lemmas, and moved the appendix lemmas (statement and proof) before the proofs of main-text lemmas. Regarding the old numbering of lemmas: we joined Lemma 8 with 10, Lemma 9 with 11, and Lemma 12 with 13. We merged Lemma 15 into 14, and Lemma 17 into 16.
- Introduction: We added the sentence "Support recovery in sparse models is of great importance in machine learning as it relates to feature selection."
- We corrected all the other typos found by the reviewers.
- Unfortunately, due to my overseas move, we have not yet been able to implement these changes: cleaning up Section 6 and the minor comment: "Use a different notation than $e_i$ for noise (e.g., $Z_i$)."
Supplementary Material: pdf
Assigned Action Editor: ~Zachary_B._Charles1
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Number: 1568