Collaborative Inter-agent Knowledge Distillation for Reinforcement Learning

Zhang-Wei Hong; Prabhat Nagarajan; Guilherme Maeda

25 Sept 2019 (modified: 05 May 2023); ICLR 2020 Conference Blind Submission; Readers: Everyone

Abstract: Reinforcement learning (RL) has demonstrated promising results across several sequential decision-making tasks. However, RL struggles to learn efficiently, which limits its application to several challenging problems. A typical RL agent learns solely from its own trial-and-error experiences and thus requires many such experiences to learn a successful policy. To alleviate this problem, we propose collaborative inter-agent knowledge distillation (CIKD). CIKD is a learning framework in which an ensemble of RL agents executes different policies in the environment while sharing knowledge among the agents in the ensemble. Our experiments demonstrate that CIKD improves upon state-of-the-art RL methods in sample efficiency and performance on several challenging MuJoCo benchmark tasks. Additionally, we present an in-depth investigation of how CIKD leads to performance improvements.

Keywords: Reinforcement learning, distillation
