Stress Testing Byzantine Robustness in Distributed Learning

16 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: societal considerations including fairness, safety, privacy
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Byzantine robustness, distributed machine learning, attacks
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Byzantine robustness in distributed learning consists in ensuring that distributed optimization algorithms, such as distributed SGD, are robust to arbitrarily malicious participants, also called Byzantine workers. Essentially, such workers attack the algorithm by sharing erroneous information, to prevent it from delivering a good model. Several defenses have been proposed so far, typically accompanied by theoretical worst-case robustness analyses. Yet, these analyses only show convergence to critical points up to large constants, which can provide a false sense of security in the absence of a strong attack benchmark. We contribute to addressing this shortcoming by modeling an optimal Byzantine adversary in distributed learning, from which we derive Jump, a long-term attack strategy that aims to steer training away from the minima of the training loss. Interestingly, even though Jump solves only a simplified form of the optimal adversary's problem, it is very powerful: even the greedy version of Jump satisfactorily breaks existing defenses. We systematically evaluate state-of-the-art attacks and defenses on MNIST and CIFAR-10 under data heterogeneity, and show that Jump consistently performs better than or comparably to other attacks. For example, on CIFAR-10 under moderate data heterogeneity, where training reaches 81% accuracy without attack, existing attacks reduce accuracy to 66% on average, whereas Jump reduces it to 50%, roughly doubling the accuracy damage. We therefore encourage the use of Jump as a stress test of Byzantine robustness in distributed learning.
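To make the setting concrete, here is a minimal, hypothetical sketch of Byzantine-robust distributed SGD as described in the abstract: honest workers send stochastic gradients, Byzantine workers send arbitrary vectors, and the server aggregates with a robust rule. Coordinate-wise median stands in for "a defense"; the toy quadratic loss and the helper names (honest_gradient, byzantine_vector, robust_aggregate) are illustrative assumptions, and the paper's Jump attack is not reproduced here.

```python
"""Sketch of one Byzantine-robust distributed SGD training loop.

Assumptions: a toy quadratic loss, Gaussian gradient noise, and an
arbitrary-noise Byzantine attack; none of this is the paper's code.
"""
import numpy as np

rng = np.random.default_rng(0)
dim, n_honest, n_byz, lr = 10, 8, 2, 0.1
target = rng.normal(size=dim)  # optimum of the toy loss 0.5*||model - target||^2

def honest_gradient(model):
    # True gradient of the toy loss plus noise, standing in for a
    # stochastic mini-batch gradient on a worker's local data shard.
    return (model - target) + rng.normal(scale=0.1, size=dim)

def byzantine_vector():
    # A Byzantine worker may report *anything*; here, large arbitrary noise.
    return rng.normal(scale=100.0, size=dim)

def robust_aggregate(grads):
    # Coordinate-wise median: one standard robust aggregation rule.
    return np.median(np.stack(grads), axis=0)

model = np.zeros(dim)
for step in range(200):
    grads = [honest_gradient(model) for _ in range(n_honest)]
    grads += [byzantine_vector() for _ in range(n_byz)]
    model -= lr * robust_aggregate(grads)  # robust SGD step at the server

print("distance to optimum:", np.linalg.norm(model - target))
```

With a naive mean instead of the median, the two Byzantine vectors would dominate the update and training would diverge; stronger attacks of the kind the abstract benchmarks aim to defeat even such robust rules.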
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 698