Keywords: byzantine-robust optimization, federated learning, generalized smoothness, normalized SGD
TL;DR: We introduce Byz-NSGDM, a normalized momentum SGD method that is robust to Byzantine attacks under $(L_0,L_1)$-smoothness, achieving an $O(K^{-1/4})$ convergence rate and outperforming prior methods on heterogeneous MNIST and synthetic tasks.
Abstract: We consider distributed optimization under Byzantine attacks in the presence of $(L_0,L_1)$-smoothness, a generalization of standard $L$-smoothness that captures functions with state-dependent gradient Lipschitz constants. We propose $\texttt{Byz-NSGDM}$, a normalized stochastic gradient descent method with momentum that achieves robustness against Byzantine workers while maintaining convergence guarantees. Our algorithm combines momentum normalization with Byzantine-robust aggregation enhanced by Nearest Neighbor Mixing (NNM) to handle the challenges posed by both $(L_0,L_1)$-smoothness and Byzantine adversaries. We prove that $\texttt{Byz-NSGDM}$ achieves a convergence rate of $O(K^{-1/4})$ up to a Byzantine bias floor proportional to the robustness coefficient and gradient heterogeneity. Experimental validation on heterogeneous MNIST classification and synthetic $(L_0,L_1)$-smooth optimization problems demonstrates the effectiveness of our approach against various Byzantine attack strategies.
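The algorithmic ingredients named in the abstract (worker-side momentum, NNM pre-aggregation, a robust aggregator, and a normalized server step) can be sketched as follows. This is a minimal illustration, not the paper's actual pseudocode: the coordinate-wise median stands in for whichever robust aggregator the paper uses, and all function names, the $n - f$ neighbor count in NNM, and the numerical-safety constant are assumptions.

```python
import numpy as np

def coordinate_median(vectors):
    # Coordinate-wise median: one simple Byzantine-robust aggregator
    # (a stand-in; the paper may use a different aggregation rule).
    return np.median(np.stack(vectors), axis=0)

def nnm(vectors, f):
    # Nearest Neighbor Mixing: replace each worker's vector with the
    # average of its n - f nearest neighbors (including itself), which
    # dilutes the influence of up to f Byzantine inputs.
    V = np.stack(vectors)
    n = len(V)
    out = np.empty_like(V)
    for i in range(n):
        dists = np.linalg.norm(V - V[i], axis=1)
        nearest = np.argsort(dists)[: n - f]
        out[i] = V[nearest].mean(axis=0)
    return list(out)

def byz_nsgdm_step(x, momenta, grads, lr, beta, f):
    # One illustrative step: each worker updates its momentum buffer,
    # the server applies NNM then robust aggregation, and finally takes
    # a *normalized* step, which is what handles (L0, L1)-smoothness.
    for i, g in enumerate(grads):
        momenta[i] = beta * momenta[i] + (1.0 - beta) * g
    aggregated = coordinate_median(nnm(momenta, f))
    norm = max(np.linalg.norm(aggregated), 1e-12)  # guard against division by zero
    return x - lr * aggregated / norm, momenta
```

On a toy quadratic with one Byzantine worker sending a huge gradient, the NNM-plus-median pipeline discards the outlier and the normalized step still makes progress, which is the qualitative behavior the abstract claims.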
Submission Number: 100