Federated Ensemble-Directed Offline Reinforcement Learning

Desik Rengarajan; Nitin Ragothaman; Dileep Kalathil; Srinivas Shakkottai

Published: 25 Sept 2024, Last Modified: 06 Nov 2024. Venue: NeurIPS 2024 poster. License: CC BY 4.0

Keywords: Deep Reinforcement Learning, Offline Reinforcement Learning, Federated Learning

TL;DR: A novel federated offline reinforcement learning algorithm

Abstract: We consider the problem of federated offline reinforcement learning (RL), a scenario in which distributed learning agents must collaboratively learn a high-quality control policy using only small pre-collected datasets generated according to different unknown behavior policies. Naïvely combining a standard offline RL approach with a standard federated learning approach to solve this problem can lead to poorly performing policies. In response, we develop the Federated Ensemble-Directed Offline Reinforcement Learning Algorithm (FEDORA), which distills the collective wisdom of the clients using an ensemble learning approach. We develop the FEDORA codebase to utilize distributed compute resources on a federated learning platform. We show that FEDORA significantly outperforms other approaches, including offline RL over the combined data pool, in various complex continuous control environments and real-world datasets. Finally, we demonstrate the performance of FEDORA in the real world on a mobile robot. We provide our code and a video of our experiments at https://github.com/DesikRengarajan/FEDORA.
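
The abstract does not state how client models are combined, so the following is only an illustrative sketch of the general idea it contrasts: plain uniform averaging of client parameters versus an ensemble-style average that favors clients whose policies appear to perform better. The function names, the softmax weighting, the `temperature` parameter, and the per-client return estimates are all assumptions made for this example; none of them are taken from the FEDORA codebase.

```python
import math


def ensemble_weights(returns, temperature=1.0):
    """Softmax weights over per-client performance estimates.

    `returns` is a hypothetical list of estimated policy returns, one per
    client. The softmax form is an assumption chosen for illustration.
    """
    top = max(returns)  # subtract the max for numerical stability
    exps = [math.exp((r - top) / temperature) for r in returns]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_average(client_params, weights):
    """Coordinate-wise weighted average of client parameter vectors."""
    dim = len(client_params[0])
    return [
        sum(w * params[i] for w, params in zip(weights, client_params))
        for i in range(dim)
    ]


def uniform_average(client_params):
    """Baseline: every client contributes equally, regardless of data quality."""
    n = len(client_params)
    return weighted_average(client_params, [1.0 / n] * n)


# Three hypothetical clients with toy two-dimensional "policy parameters".
client_params = [[1.0, 0.0], [0.0, 1.0], [4.0, 4.0]]
# Hypothetical return estimates: the third client's data came from a better
# behavior policy, so its learned policy scores higher.
client_returns = [10.0, 12.0, 50.0]

weights = ensemble_weights(client_returns, temperature=10.0)
print("weights:", weights)
print("uniform average:", uniform_average(client_params))
print("performance-weighted average:", weighted_average(client_params, weights))
```

In this toy setting the weighted average is pulled toward the third client's parameters, while the uniform average treats all three clients alike. That is one simple way in which data from poor behavior policies could drag down a naively federated model.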

Supplementary Material: zip

Primary Area: Reinforcement learning

Submission Number: 8054
