Finite-Time Analysis of Asynchronous Multi-Agent TD Learning

Published: 01 Jan 2024 · Last Modified: 14 May 2025 · ACC 2024 · CC BY-SA 4.0
Abstract: Recent research has theoretically shown the beneficial effect of cooperation in multi-agent reinforcement learning (MARL). In a setting involving $N$ agents, this benefit usually takes the form of an $N$-fold linear convergence speedup, i.e., a reduction, proportional to $N$, in the number of iterations required to reach a given convergence precision. In this paper, we show for the first time that this speedup property also holds for a MARL framework subject to asynchronous delays in the local agents' updates. In particular, we consider a policy evaluation problem in which multiple agents cooperate to evaluate a common policy by communicating with a central aggregator. In this setting, we study the finite-time convergence of AsyncMATD, an asynchronous multi-agent temporal difference (TD) learning algorithm in which the agents' local TD update directions are subject to asynchronous bounded delays. Our main contribution is a finite-time analysis of AsyncMATD, for which we establish a linear convergence speedup while highlighting the effect of time-varying asynchronous delays on the resulting convergence rate.
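To make the setting concrete, the following is a minimal sketch of an AsyncMATD-style update loop: $N$ agents each compute a TD(0) semi-gradient direction at a stale iterate $\theta_{t-\tau_i(t)}$ (with time-varying delay $\tau_i(t) \le \tau_{\max}$), and a central aggregator averages the delayed directions to update the shared parameter. The Markov reward process, feature map, and all constants (`n_agents`, `tau_max`, step size, horizon) are illustrative assumptions, not the paper's exact setup or pseudocode.

```python
import numpy as np

# Hedged toy simulation of asynchronous multi-agent TD(0) with
# linear function approximation. Environment and constants are
# assumed for illustration only.
rng = np.random.default_rng(0)

n_states, d = 10, 4          # states of a common Markov reward process; feature dimension
n_agents, tau_max = 5, 3     # N cooperating agents; bound on asynchronous delays
alpha, gamma, T = 0.05, 0.9, 2000

P = rng.dirichlet(np.ones(n_states), size=n_states)    # transition kernel under the fixed policy
r = rng.standard_normal(n_states)                      # per-state rewards
Phi = rng.standard_normal((n_states, d)) / np.sqrt(d)  # linear features

def td_direction(theta, s, s_next):
    """TD(0) semi-gradient direction for one observed transition."""
    delta = r[s] + gamma * Phi[s_next] @ theta - Phi[s] @ theta
    return delta * Phi[s]

theta = np.zeros(d)
history = [theta.copy()]                          # past iterates, enabling delayed reads
states = rng.integers(n_states, size=n_agents)    # each agent samples its own trajectory

for t in range(T):
    g = np.zeros(d)
    for i in range(n_agents):
        # Agent i evaluates its TD direction at the stale iterate
        # theta_{t - tau_i(t)}, with a time-varying bounded delay.
        tau = rng.integers(0, min(tau_max, t) + 1)
        stale_theta = history[t - tau]
        s_next = rng.choice(n_states, p=P[states[i]])
        g += td_direction(stale_theta, states[i], s_next)
        states[i] = s_next
    # Central aggregator averages the N delayed directions and updates.
    theta = theta + alpha * g / n_agents
    history.append(theta.copy())
```

Averaging the $N$ agents' directions is what drives the linear speedup intuition: the variance of the aggregated update shrinks with $N$, while the bounded staleness $\tau_{\max}$ enters the convergence rate as an additional error term.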