Over-the-Air Federated TD Learning

Published: 12 May 2023, Last Modified: 23 May 2023 · MLSys-RCLWN 2023
Keywords: Cooperative Reinforcement Learning, Federated Learning, Communication-Efficiency, Wireless Networks
TL;DR: We provide the first finite-time convergence analysis for a federated TD learning algorithm subject to noisy wireless communication channels, and establish a linear convergence speedup under Markovian sampling.
Abstract: In recent years, federated learning has been widely studied to speed up various \textit{supervised} learning tasks at the wireless network edge under communication constraints. However, there is a lack of theoretical understanding as to whether similar speedups in sample complexity can be achieved for cooperative reinforcement learning (RL) problems subject to realistic communication models. To that end, we study a federated policy evaluation problem over wireless fading channels where, to update model parameters, a central server aggregates local temporal difference (TD) update directions from $N$ agents via analog over-the-air computation (OAC). We refer to this scheme as \texttt{OAC-FedTD} and provide a rigorous finite-time convergence analysis of its performance that accounts for linear function approximation, Markovian sampling, and channel-induced distortions and noise. Our analysis reveals the impact of the noisy fading channels on the convergence rate and establishes a linear convergence speedup with respect to the number of agents. To the best of our knowledge, this is the first non-asymptotic analysis of a cooperative RL setting under channel effects. Moreover, our proof yields tighter mixing-time bounds relative to existing work in federated RL (without channel effects), and may therefore be of independent interest.
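
For intuition, below is a minimal Python sketch of one way a single \texttt{OAC-FedTD} round could look under simplifying assumptions: each agent computes a local TD(0) direction with linear function approximation, and the server receives the faded, noise-corrupted analog sum and rescales it before updating. The transitions are synthetic (not Markovian trajectories), and the fading model, noise level, rescaling rule, and all names (`h`, `sigma`, `oac_aggregate`, etc.) are illustrative assumptions, not the paper's exact protocol.

```python
import numpy as np

# --- Hypothetical dimensions and hyperparameters (not from the paper) ---
N, d, gamma, alpha = 10, 8, 0.95, 0.05   # agents, feature dim, discount, step size
sigma = 0.1                              # channel noise std (assumed)
rng = np.random.default_rng(0)

def local_td_direction(theta, s_feat, r, s_next_feat):
    """TD(0) semi-gradient direction with linear function approximation."""
    delta = r + gamma * theta @ s_next_feat - theta @ s_feat  # TD error
    return delta * s_feat

def oac_aggregate(directions, h, sigma):
    """Analog over-the-air aggregation: the channel sums the faded
    transmissions and adds Gaussian noise; the server rescales."""
    y = sum(h_i * g_i for h_i, g_i in zip(h, directions))
    y += sigma * rng.standard_normal(d)          # additive channel noise
    return y / (len(directions) * np.mean(h))    # simple rescaling (assumed)

theta = np.zeros(d)
for t in range(1000):
    # Each agent would sample a transition from its own Markovian trajectory;
    # here we draw synthetic features and rewards purely for illustration.
    dirs = []
    for i in range(N):
        s, s_next = rng.standard_normal(d), rng.standard_normal(d)
        r = rng.standard_normal()
        dirs.append(local_td_direction(theta, s, r, s_next))
    h = 0.5 + rng.random(N)                      # fading coefficients (assumed known)
    theta += alpha * oac_aggregate(dirs, h, sigma)

print(f"final ||theta|| = {np.linalg.norm(theta):.3f}")
```

The key point the sketch illustrates is that the server never sees the individual directions: it only observes their channel-induced superposition plus noise, which is exactly the distortion the paper's finite-time analysis has to account for.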