FedX: Federated Learning for Compositional Pairwise Risk Optimization

Published: 01 Feb 2023, Last Modified: 12 Mar 2024. Submitted to ICLR 2023.
Keywords: Federated Learning, Pairwise Loss, Compositional Optimization
TL;DR: A communication-efficient algorithm to optimize compositional pairwise risks in federated learning.
Abstract: In this paper, we tackle a novel federated learning (FL) problem of optimizing a family of compositional pairwise risks, to which no existing FL algorithms are applicable. In particular, the objective has the form $\mathbb{E}_{\mathbf{z}\sim \mathcal{S}_1} f\big(\mathbb{E}_{\mathbf{z}'\sim\mathcal{S}_2} \ell(\mathbf{w}; \mathbf{z}, \mathbf{z}')\big)$, where the two data sets $\mathcal{S}_1, \mathcal{S}_2$ are distributed over multiple machines, $\ell(\cdot; \cdot,\cdot)$ is a pairwise loss that depends only on the prediction outputs of the input data pair $(\mathbf{z}, \mathbf{z}')$, and $f(\cdot)$ is a possibly non-linear, non-convex function. This problem has important applications in machine learning, e.g., AUROC maximization with a pairwise loss and partial AUROC maximization with a compositional loss. The challenges of designing an FL algorithm lie in the non-decomposability of the objective over multiple machines and the interdependency between different machines. We propose two provable FL algorithms (FedX) for handling linear and nonlinear $f$, respectively. To tackle the challenges, we decouple the gradient's components into two types, namely active parts and lazy parts, where the {\it active} parts depend on local data and can be computed with the local model, and the {\it lazy} parts depend on other machines and are communicated/computed based on historical models. We develop a novel theoretical analysis to address the latency of the lazy parts and the interdependency between the local gradient estimators and the involved data. We establish both iteration and communication complexities and show that using historical models to compute the lazy parts does not degrade the complexity results. We conduct empirical studies of FedX for AUROC and partial AUROC maximization, and demonstrate its performance compared with multiple baselines.
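To make the objective concrete, below is a minimal sketch (not the authors' FedX implementation) of the centralized compositional pairwise risk $\mathbb{E}_{\mathbf{z}\sim \mathcal{S}_1} f(\mathbb{E}_{\mathbf{z}'\sim\mathcal{S}_2} \ell(\mathbf{w}; \mathbf{z}, \mathbf{z}'))$, instantiated with a squared-hinge pairwise surrogate for AUROC where $\mathcal{S}_1$ holds positive examples and $\mathcal{S}_2$ holds negative examples. All names (`pairwise_loss`, `compositional_risk`, the margin, the choice of $f$, and the toy data) are hypothetical choices for illustration; the paper's setting additionally distributes $\mathcal{S}_1, \mathcal{S}_2$ across machines.

```python
# Minimal sketch of the compositional pairwise risk
#   E_{z ~ S1} f( E_{z' ~ S2} ell(w; z, z') )
# with a squared-hinge pairwise loss on a linear model. Illustrative only;
# not the federated FedX algorithm from the paper.
import numpy as np

def scores(w, X):
    """Prediction outputs of a linear model for a batch of inputs."""
    return X @ w

def pairwise_loss(s_pos, s_neg, margin=1.0):
    """Squared-hinge pairwise loss ell on the prediction outputs of (z, z')."""
    return np.maximum(0.0, margin - (s_pos - s_neg)) ** 2

def compositional_risk(w, X_pos, X_neg, f=lambda v: v):
    """E_{z in S1} f( E_{z' in S2} ell(w; z, z') ).

    With f the identity this reduces to the usual pairwise AUROC surrogate;
    a nonlinear, possibly non-convex f yields compositional objectives such
    as the partial-AUROC surrogates mentioned in the abstract.
    """
    s_pos = scores(w, X_pos)                    # outputs on S1 (positives)
    s_neg = scores(w, X_neg)                    # outputs on S2 (negatives)
    inner = pairwise_loss(s_pos[:, None], s_neg[None, :]).mean(axis=1)  # E over S2
    return f(inner).mean()                      # E over S1

# Toy usage on synthetic data (hypothetical values).
rng = np.random.default_rng(0)
w = rng.normal(size=5)
X_pos = rng.normal(1.0, 1.0, (8, 5))
X_neg = rng.normal(-1.0, 1.0, (12, 5))
print(compositional_risk(w, X_pos, X_neg, f=np.sqrt))
```

In the federated setting the inner expectation over $\mathcal{S}_2$ cannot be formed locally, which is what motivates the paper's split of the gradient into active parts (computable from local data and the local model) and lazy parts (communicated from other machines and evaluated with historical models).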
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: General Machine Learning (ie none of the above)
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/arxiv:2210.14396/code)