Dynamic layer-wise sparsification for distributed deep learning

Hao Zhang, Tingting Wu, Zhifeng Ma, Feng Li, Jie Liu

Published: 01 Jan 2023, Last Modified: 06 Feb 2025. Venue: Future Gener. Comput. Syst. 2023. License: CC BY-SA 4.0

Abstract: Highlights
• An inevitable gap exists between theoretical and practical Top-k sparsification.
• DLS alters the sparsity ratio of each layer during model training.
• DLS achieves both good performance and high training efficiency.
• DLS(s) further reduces the introduced overhead without performance degradation.
• Performance is evaluated on four datasets and a wide variety of models.
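To make the first two highlights concrete, here is a minimal sketch of generic Top-k gradient sparsification applied globally versus layer by layer. It is not the DLS algorithm: the abstract does not say how DLS chooses per-layer ratios, so the function names, the toy gradients, and the fixed ratios below are all hypothetical. Reading the "gap" as the difference between global and per-layer selection is also an assumption, though it is one plausible interpretation.

```python
import heapq

def top_k_sparsify(grad, ratio):
    """Keep the k = max(1, floor(ratio * n)) largest-magnitude entries; zero the rest."""
    k = max(1, int(ratio * len(grad)))
    keep = set(heapq.nlargest(k, range(len(grad)), key=lambda i: abs(grad[i])))
    return [g if i in keep else 0.0 for i, g in enumerate(grad)]

def layerwise_sparsify(layer_grads, layer_ratios):
    """Apply Top-k to each layer separately, with its own (hypothetical) ratio."""
    return {name: top_k_sparsify(g, layer_ratios[name])
            for name, g in layer_grads.items()}

def global_sparsify(layer_grads, ratio):
    """Apply Top-k once over all gradients of all layers taken together."""
    flat = [(name, i, g)
            for name, grads in layer_grads.items()
            for i, g in enumerate(grads)]
    k = max(1, int(ratio * len(flat)))
    keep = {(n, i) for n, i, _ in heapq.nlargest(k, flat, key=lambda t: abs(t[2]))}
    return {name: [g if (name, i) in keep else 0.0 for i, g in enumerate(grads)]
            for name, grads in layer_grads.items()}

# Toy gradients (made up for illustration): one layer with large values, one with small.
layer_grads = {
    "conv": [0.9, -0.8, 0.7, 0.6],
    "fc": [0.05, -0.02, 0.01, 0.03],
}

per_layer = layerwise_sparsify(layer_grads, {"conv": 0.5, "fc": 0.5})
overall = global_sparsify(layer_grads, 0.5)
print(per_layer)
print(overall)
```

With the same overall budget, the global selection keeps only entries from the large-magnitude layer, while the per-layer selection keeps half of each layer. Changing the entries of the ratio dictionary between iterations is the kind of knob a dynamic layer-wise scheme would adjust.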