Generalization Error Analysis for Attack-Free and Byzantine-Resilient Decentralized Learning With Data Heterogeneity
Abstract: Decentralized learning, which facilitates joint model training across geographically scattered agents, has gained significant attention in the field of signal and information processing in recent years. While the optimization errors of decentralized learning algorithms have been extensively studied, their generalization errors remain relatively under-explored. As the generalization error reflects the scalability of a trained model on unseen data and is crucial in determining the performance of a trained model in real-world applications, understanding the generalization error of decentralized learning is of paramount importance. In this paper, we present a fine-grained generalization error analysis for both attack-free and Byzantine-resilient decentralized learning with heterogeneous data and under mild assumptions, in contrast to prior studies that consider homogeneous data and/or rely on a stringent bounded stochastic gradient assumption. Our results shed light on the impact of data heterogeneity, model initialization, and stochastic gradient noise (factors that have not been closely investigated before) on the generalization error of decentralized learning. We also reveal that Byzantine attacks performed by malicious agents largely affect the generalization error, and that their negative impact is inherently linked to the data heterogeneity while remaining independent of the sample size. Numerical experiments on both strongly convex and non-convex tasks are conducted to validate our theoretical findings.