Exploring the Impact of Attacks on Ring AllReduce

Jiayu Wang, Peng Liu, Zehua Guo, Sen Liu, Chao Yao

Published: 2021, Last Modified: 10 Nov 2023, APNet 2021, Readers: Everyone

Abstract: Distributed Machine Learning (DML) is widely used to accelerate the training of deep learning models. In DML, Parameter-Server (PS) and Ring AllReduce are two typical architectures. Many recent works address the security problem in PS, whose performance can be greatly degraded by malicious participants during the training process. However, the robustness of Ring AllReduce, which can solve the communication bandwidth problem in PS, against malicious participants is still unknown. In this paper, we design a series of experiments to explore the security problem in Ring AllReduce and reveal that it can also suffer from malicious participants.
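To make the setting concrete, the following is a minimal, single-process sketch of a sum-based Ring AllReduce (scatter-reduce followed by all-gather) with an optional tampering hook. It is an illustration only and not the authors' experimental setup: the function name `ring_allreduce`, the `tamper` parameter, and the sign-flipping attack are all assumptions introduced here, and each chunk is reduced to a single float for brevity.

```python
def ring_allreduce(grads, tamper=None):
    """Simulate a sum Ring AllReduce over len(grads) workers.

    grads[i] is worker i's gradient, split into n chunks (one float per
    chunk here). `tamper` is a hypothetical attack hook: it maps a worker
    index to a function applied to every chunk that worker sends.
    """
    n = len(grads)
    tamper = tamper or {}
    buf = [list(g) for g in grads]  # each worker's local buffer

    def send(src, value):
        # An honest worker forwards the value unchanged.
        return tamper[src](value) if src in tamper else value

    # Scatter-reduce: in each step, worker i sends chunk (i - step) % n to
    # its right neighbour, which adds it to its own copy. After n - 1 steps,
    # worker i holds the complete sum for chunk (i + 1) % n.
    for step in range(n - 1):
        msgs = [(i, (i - step) % n) for i in range(n)]
        msgs = [(i, c, send(i, buf[i][c])) for i, c in msgs]
        for src, c, value in msgs:
            buf[(src + 1) % n][c] += value

    # All-gather: each completed chunk travels around the ring, and the
    # receiver overwrites its local copy with the received value.
    for step in range(n - 1):
        msgs = [(i, (i + 1 - step) % n) for i in range(n)]
        msgs = [(i, c, send(i, buf[i][c])) for i, c in msgs]
        for src, c, value in msgs:
            buf[(src + 1) % n][c] = value

    return buf


grads = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0], [100.0, 200.0, 300.0]]

# All workers honest: every worker ends with the element-wise sum.
honest = ring_allreduce(grads)

# Hypothetical attack: worker 1 flips the sign of every chunk it forwards.
attacked = ring_allreduce(grads, tamper={1: lambda v: -v})
```

In this toy run, `honest` contains the same summed vector on every worker, while `attacked` does not. Because each worker only ever sees partially reduced chunks from its left neighbour, a single tampering worker in this sketch can alter the result every other worker receives, which is the kind of exposure the abstract refers to.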
