Communication-Efficient Adaptive Federated Bi-level Optimization with Data and System Heterogeneity

TMLR Paper 6655 Authors

26 Nov 2025 (modified: 04 Dec 2025) · Under review for TMLR · CC BY 4.0
Abstract: Bilevel optimization is a widely used nested optimization model in machine learning. Federated bilevel optimization, which extends bilevel optimization to the federated learning setting, faces challenges such as complex nested sub-loops, high communication overhead, and a lack of adaptive mechanisms. To address these issues, this paper proposes an Adaptive Single-loop Federated Bilevel Optimization algorithm (ASFBO) that handles both data heterogeneity (non-IID client data) and system heterogeneity (partial client participation per round and varying numbers of local iterations). By replacing nested sub-iterations with a single-loop architecture, ASFBO significantly reduces communication frequency and computational cost. It employs multiple adaptive learning-rate variables to dynamically adjust the step sizes of the upper-level updates, thereby accelerating convergence. Furthermore, a locally accelerated variant (LA-ASFBO) that incorporates momentum-based variance reduction is proposed to effectively mitigate hypergradient estimation bias across distributed nodes. Theoretical analysis shows that, under the classic setting of a non-convex upper level and a strongly convex lower level, ASFBO and LA-ASFBO converge to an $\epsilon$-stationary point with only $\tilde{\mathcal{O}}(\epsilon^{-2})$ sample complexity and $\tilde{\mathcal{O}}(\epsilon^{-1})$ communication complexity. Experiments on federated hyper-representation learning tasks demonstrate the superiority of the proposed algorithms.
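To make the abstract's high-level description concrete, the sketch below illustrates, on a toy quadratic bilevel problem, the three ingredients the abstract names: a single-loop structure (one lower-level step per round instead of an inner sub-loop), a momentum-based variance-reduced hypergradient estimate (STORM-style), and an AdaGrad-style adaptive upper-level step size. This is not the paper's ASFBO/LA-ASFBO; the actual update rules, problem data, function names, and hyperparameters here are all illustrative assumptions.

```python
import numpy as np

# Toy quadratic bilevel problem (all problem data is made up for illustration):
#   lower level:  y*(x) = argmin_y 0.5 * ||y - A x||^2   =>   y*(x) = A x
#   upper level:  min_x  0.5 * ||x - y*(x)||^2
rng = np.random.default_rng(0)
d = 5
A = 0.3 * rng.standard_normal((d, d)) + np.eye(d)

def grad_g_y(x, y, xi):
    """Stochastic lower-level gradient w.r.t. y (additive noise xi)."""
    return (y - A @ x) + 0.01 * xi

def hypergrad(x, y, xi):
    """Stochastic hypergradient of the upper objective w.r.t. x.

    For this toy problem dy*/dx = A, so the implicit-function formula
    gives (I - A^T)(x - y) when y tracks y*(x)."""
    return (np.eye(d) - A.T) @ (x - y) + 0.01 * xi

def single_loop_bilevel(T=2000, eta_y=0.5, eta_x=1.0, beta=0.9):
    x = rng.standard_normal(d)
    y = np.zeros(d)
    v = np.zeros(d)       # variance-reduced hypergradient estimate
    acc = np.zeros(d)     # AdaGrad-style squared-gradient accumulator
    x_old, y_old = x.copy(), y.copy()
    for t in range(T):
        xi = rng.standard_normal(d)   # shared sample for the STORM correction
        # single-loop structure: one lower-level step per round
        y, y_old = y - eta_y * grad_g_y(x, y, xi), y
        g_cur = hypergrad(x, y, xi)
        if t == 0:
            v = g_cur
        else:
            # STORM-style recursion: v_t = g_t + (1 - beta)(v_{t-1} - g_{t-1}),
            # with g_{t-1} re-evaluated at the previous point on the same sample
            v = g_cur + (1.0 - beta) * (v - hypergrad(x_old, y_old, xi))
        # adaptive (AdaGrad-style) upper-level step size
        acc += v * v
        x, x_old = x - eta_x * v / (np.sqrt(acc) + 1e-8), x
    return x, y

x, y = single_loop_bilevel()
print("lower-level tracking error ||y - A x||:", np.linalg.norm(y - A @ x))
print("upper-level residual       ||x - y||  :", np.linalg.norm(x - y))
```

In the federated setting the abstract describes, each participating client would run such local updates and periodically communicate, which is how the single-loop structure reduces communication frequency; the sketch above keeps everything on one node for brevity.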
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Nicolas_Loizou1
Submission Number: 6655