Keywords: Zeroth-Order Optimization, Federated Optimization, Hessian
Abstract: Zeroth-order (ZO) optimization enables dimension-free communication in federated learning (FL), making it attractive for fine-tuning large language models (LLMs) thanks to its substantial communication savings. However, existing ZO-FL methods largely overlook curvature information, despite its well-established benefits for accelerating convergence. To address this, we propose **HiSo**, a Hessian-informed ZO federated optimization method that accelerates convergence by leveraging global diagonal Hessian approximations while strictly preserving scalar-only communication, **without transmitting any second-order information**. Theoretically, for non-convex functions, we show that HiSo can achieve an accelerated convergence rate that is independent of the Lipschitz constant $L$ and the model dimension $d$ under suitable assumptions on the Hessian approximation, offering a plausible explanation for the commonly observed phenomenon that ZO convergence is much faster than its worst-case $O(d)$ bound suggests. Empirically, across diverse LLM fine-tuning benchmarks, HiSo delivers a 1$\sim$5× speedup in communication rounds over existing state-of-the-art ZO-FL baselines. This superior convergence not only cuts communication costs but also provides strong empirical evidence that Hessian information acts as an effective accelerator in federated ZO optimization.
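For intuition, below is a minimal Python sketch of a diagonally preconditioned two-point ZO update with seed-based, scalar-only communication. It is an illustrative toy, not the HiSo algorithm from the paper: the names (`client_zo_scalar`, `server_update`, `h_diag`) are hypothetical, and the diagonal curvature estimate is simply assumed to be available rather than estimated as the paper proposes.

```python
import numpy as np

def client_zo_scalar(loss_fn, x, seed, mu=1e-3):
    """Client side: two-point finite difference along a random direction
    derived from a shared seed. Only this scalar (plus the seed both
    sides already know) needs to be communicated."""
    z = np.random.default_rng(seed).standard_normal(x.shape)
    return (loss_fn(x + mu * z) - loss_fn(x - mu * z)) / (2 * mu)

def server_update(x, g_scalar, seed, h_diag, lr=0.02):
    """Server side: regenerate the same direction from the seed, form the
    ZO gradient estimate, and apply a diagonal curvature preconditioner."""
    z = np.random.default_rng(seed).standard_normal(x.shape)
    grad_est = g_scalar * z
    return x - lr * grad_est / (h_diag + 1e-8)

# Toy usage on a mildly ill-conditioned quadratic (illustrative only).
A = np.diag([1.0, 10.0])
loss = lambda w: 0.5 * w @ A @ w
w = np.array([1.0, 1.0])
h_diag = np.diag(A).copy()  # assumed diagonal Hessian estimate
for t in range(500):
    g = client_zo_scalar(loss, w, seed=t)          # scalar "uplink" per round
    w = server_update(w, g, seed=t, h_diag=h_diag)
print("final loss:", loss(w))
```

The sketch highlights why the per-round payload stays dimension-free: the server regenerates the perturbation locally from the shared seed, so only scalars travel, and the diagonal preconditioner rescales the update per coordinate.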
Supplementary Material: zip
Primary Area: optimization
Submission Number: 13051