Federated Ensemble-Directed Offline Reinforcement Learning

Keywords: Offline Reinforcement Learning, Federated Learning
TL;DR: FEDORA: A general federated offline RL algorithm for clients with heterogeneous data
Abstract: We consider the problem of federated offline reinforcement learning (RL), where clients must collaboratively learn a control policy using only data collected under unknown behavior policies. Naively combining a standard offline RL approach with a standard federated learning approach to solve this problem can lead to poorly performing policies. We develop the Federated Ensemble-Directed Offline Reinforcement Learning Algorithm (FEDORA), which distills the collective wisdom of the clients using an ensemble learning approach. We show that FEDORA significantly outperforms other approaches, including offline RL over the combined data pool, in various complex continuous control and real-world environments.
