HiSo: Efficient Federated Zeroth-Order Optimization via Hessian-Informed Acceleration and Scalar-Only Communication
Keywords: Federated Optimization, Hessian, Zeroth-Order Optimization, LLM Fine-Tuning
Abstract: Recent Federated Learning (FL) methods with dimension-free communication significantly reduce communication overhead by transmitting only scalars via zeroth-order stochastic gradient descent (ZO-SGD), making them well-suited for federated fine-tuning of Large Language Models (LLMs). Yet the high variance of ZO gradient estimates slows convergence. While Hessian information can accelerate convergence, integrating it into FL is challenging due to restrictions on clients' local data and the need to preserve dimension-free communication. To address this, we first introduce a generalized scalar-only-communication FL framework that decouples dimension-free communication from standard ZO-SGD, enabling the integration of more advanced optimizers. Building on this framework, we propose HiSo, a new FL method based on Hessian-informed zeroth-order optimization and Scalar-only communication. Specifically, it exploits global curvature to accelerate convergence while retaining minimal communication. Theoretically, we establish convergence guarantees independent of the Lipschitz constant $L$ and the model dimension $d$.
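To illustrate why the communication here is dimension-free, the following is a minimal sketch (not the authors' exact algorithm) of a shared-seed, two-point zeroth-order estimator: the client transmits only a random seed and one scalar, and the server reconstructs the update direction from the seed. The function name `zo_scalar_grad` and the curvature-scaling comment are illustrative assumptions.

```python
import numpy as np

def zo_scalar_grad(loss_fn, w, seed, mu=1e-3):
    """Two-point zeroth-order estimate of the directional derivative.

    Returns only a scalar; the perturbation direction z can be
    regenerated on the server from the same seed, so the client
    never sends a d-dimensional vector.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(w.shape)                    # shared-seed perturbation
    g = (loss_fn(w + mu * z) - loss_fn(w - mu * z)) / (2 * mu)
    return g                                            # single float per query

# Client: send (seed, g) -- O(1) communication instead of O(d).
# Server: regenerate z from seed and apply  w <- w - lr * g * z,
# optionally rescaling the step with a global curvature estimate,
# which is the kind of Hessian-informed acceleration HiSo targets.
```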
Submission Number: 16