Keywords: Federated Learning, Forward Gradient, Zeroth-Order, Parameter-Efficient Fine-Tuning
TL;DR: FineFed is a forward-only FL framework with shared momentum, uncertainty-guided forward gradients, and forward-only head tuning, accelerating convergence and reducing compute, memory, and communication across vision/NLP benchmarks under non-IID data.
Abstract: Federated learning (FL) on resource-constrained edge devices faces significant challenges when training large transformer models, particularly due to memory and computational limitations. While parameter-efficient fine-tuning (PEFT) methods help reduce memory usage, they still require back-propagation for gradient computation, which often demands more memory than storing the model parameters themselves. Forward-gradient (zeroth-order) FL offers a promising alternative by eliminating back-propagation, but existing methods suffer from computational inefficiency, poor performance on many-class tasks, and unstable convergence under non-IID data distributions.
We present \emph{FineFed}, an efficient forward-only FL framework that addresses these limitations through three key innovations: (i) \textbf{Forward-Only Head Tuning}, which enables exact gradient computation for many-class classification heads without back-propagation; (ii) \textbf{Uncertainty-Guided Forward Gradient Estimation}, which reduces computational cost by approximately $2.5\times$ via uncertainty-guided sample selection and micro-batch perturbations; and (iii) \textbf{Shared Momentum}, which ensures stable local updates and fast convergence under extreme non-IID data heterogeneity. Comprehensive evaluations across NLP and vision datasets demonstrate that FineFed achieves superior model accuracy and system efficiency compared to state-of-the-art methods, making forward-only federated learning practical for real-world deployment.
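To make the forward-gradient (zeroth-order) idea concrete: the gradient can be estimated from forward passes alone by probing the loss along random perturbation directions. The sketch below is a generic SPSA-style estimator illustrating the principle only; the function and parameter names are illustrative and it is not the FineFed algorithm (which additionally uses uncertainty-guided sample selection, micro-batch perturbations, and shared momentum).

```python
import numpy as np

def zeroth_order_grad(loss_fn, params, eps=1e-3, num_samples=4, seed=None):
    """Estimate grad of loss_fn at params using only forward evaluations.

    Averages two-point directional-derivative estimates over random
    Gaussian directions (SPSA-style). Illustrative sketch only.
    """
    rng = np.random.default_rng(seed)
    grad = np.zeros_like(params)
    for _ in range(num_samples):
        u = rng.standard_normal(params.shape)  # random probe direction
        # Directional derivative along u from two forward passes,
        # no back-propagation required.
        d = (loss_fn(params + eps * u) - loss_fn(params - eps * u)) / (2 * eps)
        grad += d * u
    return grad / num_samples

# Toy check on a quadratic loss f(x) = ||x||^2, whose true gradient is 2x.
x = np.array([1.0, -2.0, 0.5])
g = zeroth_order_grad(lambda p: float(np.sum(p ** 2)), x,
                      num_samples=200, seed=0)
```

Memory cost is two forward passes per probe and one perturbation vector, which is why zeroth-order methods suit devices where back-propagation activations do not fit; the trade-off is estimator variance, which motivates variance-reduction techniques such as the uncertainty-guided selection described above.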
Our code is available at \url{https://anonymous.4open.science/r/FineFed-0554/}.
Primary Area: optimization
Submission Number: 9100