Federated Learning with Local Openset Noisy Labels

23 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Federated Learning, openset, noisy label, heterogeneous
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: A more practical but challenging noisy label problem in federated learning and its solution
Abstract: Federated learning (FL) is a learning paradigm that allows a central server to learn from different data sources while keeping the data private locally. Without control over or monitoring of the local data collection process, the locally available training labels are likely noisy, $\textit{i.e.}$, the collected training labels differ from the unobservable ground truth. Additionally, in heterogeneous FL, each local client may have access only to a subset of the label space (referred to as openset label learning), and these subsets may not overlap across clients. In this work, we study the challenge of federated learning with local openset noisy labels. We observe that many existing solutions in the noisy label literature, $\textit{e.g.}$, loss correction, are ineffective during local training because they overfit to noisy labels and do not generalize to openset labels. To address these problems, we design a label communication mechanism that shares randomly selected ``contrastive labels'' among clients. The privacy of the shared contrastive labels is protected by label differential privacy (DP). Both the DP guarantee and the effectiveness of our approach are established theoretically. Compared with several baseline methods, our solution demonstrates its effectiveness on several public benchmarks and real-world datasets under different noise ratios and noise models.
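The abstract does not specify how the shared contrastive labels are privatized, only that label DP is used. As a minimal illustrative sketch (not the paper's actual mechanism), a client could apply k-ary randomized response to a label before sharing it; the function name and parameters below are assumptions for illustration only.

```python
# Hypothetical sketch: releasing a "contrastive label" under epsilon-label-DP
# via k-ary randomized response. This is NOT taken from the paper; it only
# illustrates one standard way label DP can be achieved.
import numpy as np

def randomized_response(label: int, num_classes: int, epsilon: float, rng=None) -> int:
    """Release `label` under epsilon-label-DP using k-ary randomized response.

    With probability p = e^eps / (e^eps + K - 1) the true label is kept;
    otherwise one of the other K - 1 labels is reported uniformly at random.
    """
    rng = np.random.default_rng() if rng is None else rng
    keep_prob = np.exp(epsilon) / (np.exp(epsilon) + num_classes - 1)
    if rng.random() < keep_prob:
        return label
    # Sample uniformly among the remaining K - 1 labels (skip the true one).
    other = int(rng.integers(num_classes - 1))
    return other if other < label else other + 1

# Example: a client privatizes a randomly selected label before sharing it.
noisy_shared_label = randomized_response(label=3, num_classes=10, epsilon=1.0)
```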
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6992