Improved Communication Efficiency in Federated Natural Policy Gradient via ADMM-based Gradient Updates

Guangchen Lan; Han Wang; James Anderson; Christopher Brinton; Vaneet Aggarwal

Published: 21 Sept 2023, Last Modified: 19 Dec 2023

Venue: NeurIPS 2023 poster

Keywords: reinforcement learning, federated learning

TL;DR: We propose a federated reinforcement learning method based on ADMM. The communication complexity of each agent is reduced from $O(d^2)$ to $O(d)$, where $d$ is the number of model parameters.

Abstract: Federated reinforcement learning (FedRL) enables agents to collaboratively train a global policy without sharing their individual data. However, high communication overhead remains a critical bottleneck, particularly for natural policy gradient (NPG) methods, which are second-order. To address this issue, we propose the FedNPG-ADMM framework, which leverages the alternating direction method of multipliers (ADMM) to approximate global NPG directions efficiently. We theoretically demonstrate that using ADMM-based gradient updates reduces communication complexity from $\mathcal{O}({d^{2}})$ to $\mathcal{O}({d})$ at each iteration, where $d$ is the number of model parameters. Furthermore, we show that achieving an $\epsilon$-error stationary convergence requires $\mathcal{O}(\frac{1}{(1-\gamma)^{2}{\epsilon}})$ iterations for discount factor $\gamma$, demonstrating that FedNPG-ADMM maintains the same convergence rate as standard FedNPG. Through evaluation of the proposed algorithms in MuJoCo environments, we demonstrate that FedNPG-ADMM maintains the reward performance of standard FedNPG, and that its convergence rate improves when the number of federated agents increases.
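
To make the $\mathcal{O}(d^{2})$ versus $\mathcal{O}(d)$ claim concrete, the following is a minimal sketch of how an ADMM-style scheme can recover a global natural-gradient-like direction while each agent uploads only a $d$-vector. It is not the paper's exact update rule: it assumes the global direction is the minimizer of a sum of local quadratics, $\sum_i (\tfrac{1}{2} w^\top F_i w - g_i^\top w)$, and applies textbook scaled-form consensus ADMM. The names `make_agent`, `consensus_admm_direction`, `rho`, and `iters`, as well as the synthetic Fisher matrices and gradients, are hypothetical and chosen only for illustration.

```python
import numpy as np

def make_agent(rng, d):
    """Hypothetical local statistics: an SPD Fisher estimate F_i and a gradient g_i."""
    A = rng.standard_normal((2 * d, d))
    F = A.T @ A / (2 * d) + 0.1 * np.eye(d)  # damping keeps F_i positive definite
    g = rng.standard_normal(d)
    return F, g

def consensus_admm_direction(agents, rho=1.0, iters=500):
    """Generic scaled-form consensus ADMM (an assumption, not the paper's algorithm) for
        min_w  sum_i ( 0.5 * w^T F_i w - g_i^T w ),
    whose minimizer is (sum_i F_i)^{-1} (sum_i g_i).
    Returns the consensus direction and the number of floats each agent uploads per round."""
    d = agents[0][1].shape[0]
    z = np.zeros(d)                    # server-side consensus variable
    u = [np.zeros(d) for _ in agents]  # scaled dual variable, kept on each agent
    eye = np.eye(d)
    for _ in range(iters):
        # Local step: the d x d matrix F_i is used on the agent and never transmitted.
        x = [np.linalg.solve(F + rho * eye, g + rho * (z - u_i))
             for (F, g), u_i in zip(agents, u)]
        # Upload step: each agent sends the single d-vector x_i + u_i.
        uploads = [x_i + u_i for x_i, u_i in zip(x, u)]
        # Server step: average the uploads and broadcast z (another d-vector).
        z = np.mean(uploads, axis=0)
        # Dual step: performed locally once the new z has been received.
        u = [u_i + x_i - z for u_i, x_i in zip(u, x)]
    return z, d

rng = np.random.default_rng(0)
d, n_agents = 6, 4
agents = [make_agent(rng, d) for _ in range(n_agents)]
w_admm, floats_up = consensus_admm_direction(agents)

# Reference: a server that collects every F_i and g_i (d*d + d floats per agent).
w_central = np.linalg.solve(sum(F for F, _ in agents), sum(g for _, g in agents))
print("max abs difference:", np.max(np.abs(w_admm - w_central)))
print("upload per round:", floats_up, "floats, versus", d * d + d, "when sending F_i and g_i")
```

In this sketch the per-round upload is $d$ floats per agent regardless of how the local matrix is formed, whereas the centralized reference needs every $d \times d$ matrix. The trade-off is that the consensus direction is only reached after several ADMM rounds, and the penalty `rho` affects how quickly the iterates agree.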

Supplementary Material: zip

Submission Number: 1340
