When Can Federated Learning Match Centralized Learning? A PAC-Bayesian Generalization Gap Analysis
Abstract: The growing focus on distributed data and privacy has spurred the rise of Federated Learning (FL). Empirical studies show that, under equal resources, FL often underperforms centralized training, but the reasons behind this gap remain theoretically unclear. This lack of understanding leaves open whether FL is inherently inferior in generalization and how the gap might be closed. We address this by formulating FL as a server-based SGD optimization problem over distributed data and analyzing its generalization gap within the PAC-Bayesian framework. Our analysis derives non-vacuous bounds on this gap, showing that it necessarily exists under equal resources and depends on the training parameters. We further prove that the gap can be fully eliminated only by introducing new clients or adding new data to existing clients, with the latter being more efficient. In contrast, granting FL advantages in other resources, such as larger models or more communication rounds, cannot close the gap. As a complementary analysis, we confirm from a stability perspective that centralized FL holds a generalization advantage over decentralized FL, justifying our FL formulation choice. Extensive experiments across different model architectures and datasets validate our theory.
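The following is a minimal sketch (not from the paper) of the server-based SGD view of FL that the abstract references: each round, the averaged client update is treated as a pseudo-gradient to which the server applies its own SGD step. All names and settings here (`local_sgd`, the toy quadratic losses, the learning rates) are illustrative assumptions, not the authors' setup.

```python
import numpy as np

# Hedged sketch: FedAvg viewed as server-side SGD. Each client runs a few
# local SGD steps; the server averages the model deltas, negates the mean
# to form a pseudo-gradient, and applies a server-side SGD update.
# Client losses are toy quadratics 0.5 * ||w - c_k||^2 with distinct
# optima c_k, standing in for heterogeneous client data.

rng = np.random.default_rng(0)
dim, n_clients = 5, 4
client_optima = rng.normal(size=(n_clients, dim))  # heterogeneity proxies
w_server = np.zeros(dim)

def local_sgd(w, c, lr=0.1, steps=5):
    """A few local SGD steps on the quadratic loss 0.5 * ||w - c||^2."""
    for _ in range(steps):
        w = w - lr * (w - c)  # gradient of the local quadratic is (w - c)
    return w

server_lr = 1.0  # server_lr = 1 recovers plain FedAvg
for round_ in range(50):
    deltas = [local_sgd(w_server.copy(), c) - w_server for c in client_optima]
    pseudo_grad = -np.mean(deltas, axis=0)         # averaged update as a gradient
    w_server = w_server - server_lr * pseudo_grad  # server-side SGD step

print("server model:        ", w_server)
print("mean of client optima:", client_optima.mean(axis=0))
```

In this toy setting the server iterate converges to the mean of the client optima; with a server learning rate of 1 the update rule coincides with vanilla FedAvg, which is what makes the server-side SGD formulation a natural analytical handle.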
Submission Number: 160