CAESAR: Enhancing Federated RL in Heterogeneous MDPs through Convergence-Aware Sampling with Screening

Published: 13 Mar 2024, Last Modified: 22 Apr 2024 · ALA 2024 · CC BY 4.0
Keywords: Reinforcement learning, federated reinforcement learning, heterogeneous environments.
TL;DR: This paper introduces CAESAR, a novel Federated Reinforcement Learning framework, designed to enhance the learning of individual agents across heterogeneous MDPs.
Abstract: In this study, we delve into Federated Reinforcement Learning (FedRL) in the context of value-based agents operating across diverse Markov Decision Processes (MDPs). Existing FedRL methods typically aggregate agents' learning by averaging their value functions to improve performance. However, this aggregation strategy is suboptimal in heterogeneous environments, where agents converge to diverse optimal value functions. To address this problem, we introduce the Convergence-AwarE SAmpling with scReening (CAESAR) aggregation scheme, designed to enhance the learning of individual agents across varied MDPs. CAESAR is a server-side aggregation strategy that combines convergence-aware sampling with a screening mechanism. By exploiting the fact that agents learning in identical MDPs converge to the same optimal value function, CAESAR enables the selective assimilation of insights from more proficient counterparts, thereby significantly enhancing overall learning efficiency. We empirically validate our hypothesis and demonstrate the effectiveness of CAESAR in enhancing the learning efficiency of agents, using both a custom-built GridWorld environment and the classical FrozenLake-v1 task, each presenting varying levels of environmental heterogeneity.
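To make the aggregation idea in the abstract concrete, below is a minimal Python sketch of a CAESAR-style server step. Everything here is an illustrative assumption rather than the paper's exact procedure: the function name `caesar_aggregate`, the use of mean absolute Q-table distance as the screening criterion, the softmax over recent returns as the convergence-aware sampling distribution, and the 50/50 averaging used to assimilate a sampled peer's knowledge.

```python
import numpy as np

def caesar_aggregate(q_tables, returns, similarity_threshold=0.5, rng=None):
    """Sketch of a CAESAR-style server aggregation step (illustrative only).

    q_tables: list of (n_states, n_actions) arrays, one per agent.
    returns:  list of recent mean episodic returns, one per agent,
              used here as a proxy for convergence/proficiency.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = len(q_tables)
    updated = []
    for i in range(n):
        # Screening (assumed criterion): keep only peers whose value
        # functions are close to agent i's, on the premise that similar
        # Q-tables indicate the same underlying MDP.
        distances = [np.mean(np.abs(q_tables[i] - q_tables[j])) for j in range(n)]
        peers = [j for j in range(n)
                 if j != i and distances[j] < similarity_threshold]
        if not peers:
            # No sufficiently similar peer: keep the agent's own values.
            updated.append(q_tables[i].copy())
            continue
        # Convergence-aware sampling (assumed distribution): prefer peers
        # with higher returns, i.e. those presumed closer to optimal.
        scores = np.array([returns[j] for j in peers], dtype=float)
        probs = np.exp(scores - scores.max())
        probs /= probs.sum()
        donor = peers[rng.choice(len(peers), p=probs)]
        # Assimilate the sampled peer's knowledge (assumed rule: averaging).
        updated.append(0.5 * (q_tables[i] + q_tables[donor]))
    return updated
```

Under these assumptions, agents in the same MDP tend to pass the screening test for one another as training progresses, so each agent increasingly samples from same-MDP peers with higher returns, which is the selective-assimilation behavior the abstract describes.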
Type Of Paper: Full paper (max 8 pages)
Anonymous Submission: Anonymized submission.
Submission Number: 10