Byzantine-Robust Learning on Heterogeneous Datasets via Resampling

28 Sept 2020 (modified: 22 Oct 2023) · ICLR 2021 Conference Blind Submission · Readers: Everyone
Keywords: Byzantine robustness, distributed training, heterogeneous dataset
Abstract: In Byzantine-robust distributed optimization, a central server wants to train a machine learning model over data distributed across multiple workers. However, a fraction of these workers may deviate from the prescribed algorithm and send arbitrary messages to the server. While this problem has received significant attention recently, most current defenses assume that the workers have identical data distribution. For realistic cases when the data across workers are heterogeneous (non-iid), we design new attacks that circumvent these defenses leading to significant loss of performance. We then propose a universal resampling scheme that addresses data heterogeneity at a negligible computational cost. We theoretically and experimentally validate our approach, showing that combining resampling with existing robust algorithms is effective against challenging attacks.
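The abstract does not spell out the resampling scheme in detail; the sketch below illustrates one plausible reading of the idea: each worker gradient is reused a small number of times, the resampled gradients are grouped and averaged so that group means are less heterogeneous, and the result is passed to an existing robust aggregation rule. The resampling factor `s`, the group construction, and the use of coordinate-wise median as the downstream aggregator are assumptions made here for illustration, not the paper's prescribed algorithm.

```python
import numpy as np

def resample_then_aggregate(grads, s=2, seed=0):
    """Illustrative sketch: resample worker gradients so each appears s times,
    average within random groups of size s, then apply an existing robust
    aggregator (coordinate-wise median chosen here purely as an example)."""
    n = len(grads)
    rng = np.random.default_rng(seed)
    # Each worker index appears exactly s times, in random order.
    idx = np.concatenate([rng.permutation(n) for _ in range(s)])
    # Partition the resampled indices into n groups of size s.
    groups = idx.reshape(n, s)
    # Average the gradients within each group to reduce heterogeneity.
    mixed = np.stack([np.mean([grads[i] for i in g], axis=0) for g in groups])
    # Feed the group means to any existing robust aggregation rule.
    return np.median(mixed, axis=0)

# Usage example: 10 workers with 5-dimensional gradients.
grads = [np.random.randn(5) for _ in range(10)]
print(resample_then_aggregate(grads, s=2))
```

Under these assumptions, the scheme is agnostic to the aggregation rule: the median call could be replaced by Krum, trimmed mean, or any other robust aggregator without changing the resampling step.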
One-sentence Summary: In this paper, we study the robust distributed learning problem under realistic heterogeneous data and propose a general resampling technique that greatly improves existing robust aggregation rules on heterogeneous data.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Supplementary Material: zip
Community Implementations: 1 code implementation (CatalyzeX): https://www.catalyzex.com/paper/arxiv:2006.09365/code
Reviewed Version (pdf): https://openreview.net/references/pdf?id=lour7CgtR