Decentralized Gossip-Based Stochastic Bilevel Optimization over Communication Networks

Shuoguang Yang; Xuezhou Zhang; Mengdi Wang

Published: 31 Oct 2022, Last Modified: 16 Jan 2023, NeurIPS 2022 Accept, Readers: Everyone

Keywords: Decentralized Optimization, Federated Learning, Bilevel Optimization, Compositional Optimization

Abstract: Bilevel optimization has attracted growing interest, with numerous applications in meta learning, minimax games, reinforcement learning, and nested composition optimization. This paper studies decentralized distributed bilevel optimization over a network where agents can only communicate with their neighbors, and gives examples from multi-task, multi-agent learning and federated learning. We propose a gossip-based distributed bilevel learning algorithm that allows networked agents to solve both the inner and outer optimization problems in a single timescale and to share information through network propagation. We show that our algorithm enjoys $\mathcal{O}(\frac{1}{K \epsilon^2})$ per-agent sample complexity for general nonconvex bilevel optimization and $\mathcal{O}(\frac{1}{K \epsilon})$ for Polyak-Łojasiewicz objectives, achieving a speedup that scales linearly with the network size $K$. The sample complexities are optimal in both $\epsilon$ and $K$. We test our algorithm on hyperparameter tuning and decentralized reinforcement learning. Simulated experiments confirm that our algorithm achieves state-of-the-art training efficiency and test accuracy.
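To illustrate what "gossip-based" communication with neighbors means, the sketch below shows plain gossip averaging on a ring network. It is not the paper's bilevel algorithm, whose update rules the abstract does not give. The ring topology, the uniform 1/3 weights, and the names `ring_mixing_matrix`, `gossip_step`, and `gossip_average` are all assumptions made for illustration.

```python
def ring_mixing_matrix(k):
    """Doubly stochastic weights for a ring of k agents (assumed topology).

    Each agent puts weight 1/3 on itself and on each of its two neighbors.
    """
    if k < 3:
        raise ValueError("this sketch assumes a ring with at least 3 agents")
    w = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in (i - 1, i, i + 1):
            w[i][j % k] = 1.0 / 3.0
    return w


def gossip_step(values, w):
    """One gossip round: each agent takes a weighted average of its neighbors."""
    k = len(values)
    return [sum(w[i][j] * values[j] for j in range(k)) for i in range(k)]


def gossip_average(values, w, rounds):
    """Repeat gossip rounds; on a connected graph all agents approach the mean."""
    for _ in range(rounds):
        values = gossip_step(values, w)
    return values


if __name__ == "__main__":
    local_values = [1.0, 2.0, 3.0, 4.0, 5.0]  # one scalar per agent
    w = ring_mixing_matrix(len(local_values))
    print(gossip_average(local_values, w, rounds=60))
```

Because the weight matrix is doubly stochastic, every round preserves the network-wide average while shrinking the disagreement between agents, so each agent ends up close to the global mean using only neighbor-to-neighbor exchanges.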

TL;DR: This paper proposes an efficient algorithm for solving stochastic bilevel optimization in decentralized networks and provides its theoretical performance guarantees in nonconvex and strongly convex regimes.

Supplementary Material: pdf
