An Efficient Transfer Learning Framework for Multiagent Reinforcement Learning

Tianpei Yang; Weixun Wang; Hongyao Tang; Jianye HAO; Zhaopeng Meng; Hangyu Mao; Dong Li; Wulong Liu; Yingfeng Chen; Yujing Hu; Changjie Fan; Chengwei Zhang

An Efficient Transfer Learning Framework for Multiagent Reinforcement Learning

Tianpei Yang, Weixun Wang, Hongyao Tang, Jianye HAO, Zhaopeng Meng, Hangyu Mao, Dong Li, Wulong Liu, Yingfeng Chen, Yujing Hu, Changjie Fan, Chengwei Zhang

Published: 09 Nov 2021, Last Modified: 05 May 2023NeurIPS 2021 PosterReaders: Everyone

Keywords: multiagent reinforcement learning, transfer learning, reinforcement learning, option, succesor representation

Abstract: Transfer Learning has shown great potential to enhance single-agent Reinforcement Learning (RL) efficiency. Similarly, Multiagent RL (MARL) can also be accelerated if agents can share knowledge with each other. However, it remains a problem of how an agent should learn from other agents. In this paper, we propose a novel Multiagent Policy Transfer Framework (MAPTF) to improve MARL efficiency. MAPTF learns which agent's policy is the best to reuse for each agent and when to terminate it by modeling multiagent policy transfer as the option learning problem. Furthermore, in practice, the option module can only collect all agent's local experiences for update due to the partial observability of the environment. While in this setting, each agent's experience may be inconsistent with each other, which may cause the inaccuracy and oscillation of the option-value's estimation. Therefore, we propose a novel option learning algorithm, the successor representation option learning to solve it by decoupling the environment dynamics from rewards and learning the option-value under each agent's preference. MAPTF can be easily combined with existing deep RL and MARL approaches, and experimental results show it significantly boosts the performance of existing methods in both discrete and continuous state spaces.

Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.

TL;DR: We propose a Multiagent Policy Transfer Framework (MAPTF) which allows efficient knowledge transfer among agents. Furthermore, we propose a new option learning method to learn the option-value function under each agent's preference.

Supplementary Material: pdf

14 Replies

Loading