When Can Federated Learning Match Centralized Learning? A PAC-Bayesian Generalization Gap Analysis
Abstract: The growing focus on distributed data and privacy has spurred the rise of Federated Learning (FL). Empirical studies show that, under equal resources, FL often underperforms centralized training, but the reasons behind this gap remain theoretically unclear. This lack of understanding leaves open whether FL is inherently inferior in generalization and how the gap might be closed. We address this by formulating FL as a server-based SGD optimization problem over distributed data and analyzing its generalization gap within the PAC-Bayesian framework. Our analysis derives non-vacuous bounds on this gap, showing that it necessarily exists under equal resources and depends on the training parameters. We further prove that the gap can be fully eliminated only by introducing new clients or adding new data to existing clients, with the latter being more efficient. In contrast, granting FL advantages in other resources, such as larger models or more communication rounds, cannot close the gap. As a complementary analysis, we confirm from a stability perspective that centralized FL holds a generalization advantage over decentralized FL, justifying our FL formulation choice. Extensive experiments across different model architectures and datasets validate our theory.
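The following is a minimal sketch (not from the paper) of the server-based SGD view of FL that the abstract references: each round, the averaged client update is treated as a pseudo-gradient to which the server applies its own SGD step. All names and settings here (`local_sgd`, the toy quadratic losses, the learning rates) are illustrative assumptions, not the authors' setup.

```python
import numpy as np

# Hedged sketch: FedAvg viewed as server-side SGD. Each client runs a few
# local SGD steps; the server averages the model deltas, negates the mean
# to form a pseudo-gradient, and applies a server-side SGD update.
# Client losses are toy quadratics 0.5 * ||w - c_k||^2 with distinct
# optima c_k, standing in for heterogeneous client data.

rng = np.random.default_rng(0)
dim, n_clients = 5, 4
client_optima = rng.normal(size=(n_clients, dim))  # heterogeneity proxies
w_server = np.zeros(dim)

def local_sgd(w, c, lr=0.1, steps=5):
    """A few local SGD steps on the quadratic loss 0.5 * ||w - c||^2."""
    for _ in range(steps):
        w = w - lr * (w - c)  # gradient of the local quadratic is (w - c)
    return w

server_lr = 1.0  # server_lr = 1 recovers plain FedAvg
for round_ in range(50):
    deltas = [local_sgd(w_server.copy(), c) - w_server for c in client_optima]
    pseudo_grad = -np.mean(deltas, axis=0)         # averaged update as a gradient
    w_server = w_server - server_lr * pseudo_grad  # server-side SGD step

print("server model:        ", w_server)
print("mean of client optima:", client_optima.mean(axis=0))
```

In this toy setting the server iterate converges to the mean of the client optima; with a server learning rate of 1 the update rule coincides with vanilla FedAvg, which is what makes the server-side SGD formulation a natural analytical handle.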
Submission Number: 160