Keywords: Federated learning, Second-order optimization, Generalization error bounds, Newton's method
TL;DR: This paper proposes a federated Newton method (sharing first-order and second-order information) with generalization error bounds.
Abstract: Most federated learning algorithms, such as FedAvg and FedProx, communicate only first-order information; this can be inefficient under heterogeneous data, and the statistical behavior of such methods remains poorly understood. We propose FedNewton, a second-order federated learning method that shares both gradient and curvature information while retaining a lightweight communication pattern. In a kernel ridge regression setting, we derive non-asymptotic excess-risk bounds for FedNewton and establish minimax-optimal learning rates, explicitly quantifying the roles of local sample size, data heterogeneity, and model heterogeneity. Our theory further shows that, under benign conditions, the federated error of FedNewton decays exponentially in the number of communication rounds. Beyond this RKHS regime, we instantiate FedNewton in a practical _backbone+head_ federated fine-tuning setting and conduct large-scale experiments on standard vision benchmarks, demonstrating that FedNewton achieves strong accuracy and efficiency compared with state-of-the-art first-order and second-order baselines.
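To make the communication pattern described above concrete, here is a minimal sketch (not the authors' code) of one FedNewton-style round, simplified from the paper's kernel ridge regression setting to plain ridge regression: each client shares its local gradient and Hessian, and the server aggregates both before taking a Newton step. All function names and parameters are illustrative.

```python
# Hedged sketch of a second-order federated round: clients send
# (gradient, Hessian) pairs; the server sample-weights and aggregates
# them, then applies one Newton update. Illustrative only.
import numpy as np

def local_stats(X, y, w, lam):
    """Per-client gradient and Hessian of the regularized least-squares loss."""
    n = X.shape[0]
    grad = X.T @ (X @ w - y) / n + lam * w
    hess = X.T @ X / n + lam * np.eye(X.shape[1])
    return grad, hess, n

def fednewton_round(clients, w, lam):
    """One communication round: aggregate local statistics, take a Newton step."""
    stats = [local_stats(X, y, w, lam) for X, y in clients]
    total = sum(n for _, _, n in stats)
    g = sum(n * g_m for g_m, _, n in stats) / total   # sample-weighted gradient
    H = sum(n * H_m for _, H_m, n in stats) / total   # sample-weighted Hessian
    return w - np.linalg.solve(H, g)                  # server-side Newton update

# Toy usage: two heterogeneous clients, a few communication rounds.
rng = np.random.default_rng(0)
clients = [(rng.normal(size=(50, 5)), rng.normal(size=50)) for _ in range(2)]
w = np.zeros(5)
for _ in range(3):
    w = fednewton_round(clients, w, lam=0.1)
```

Because the aggregated objective here is quadratic, the first aggregated Newton step already solves it exactly; the exponential-in-rounds error decay claimed in the abstract concerns the more general federated setting analyzed in the paper.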
Supplementary Material: zip
Primary Area: learning theory
Submission Number: 19042