LIGS: Learnable Intrinsic-Reward Generation Selection for Multi-Agent Learning

David Henry Mguni; Taher Jafferjee; Jianhong Wang; Nicolas Perez-Nieves; Oliver Slumbers; Feifei Tong; Yang Li; Jiangcheng Zhu; Yaodong Yang; Jun Wang

LIGS: Learnable Intrinsic-Reward Generation Selection for Multi-Agent Learning

David Henry Mguni, Taher Jafferjee, Jianhong Wang, Nicolas Perez-Nieves, Oliver Slumbers, Feifei Tong, Yang Li, Jiangcheng Zhu, Yaodong Yang, Jun Wang

Published: 28 Jan 2022, Last Modified: 13 Feb 2023ICLR 2022 PosterReaders: Everyone

Keywords: multi-agent, reinforcement learning, intrinsic rewards, exploration

Abstract: Efficient exploration is important for reinforcement learners (RL) to achieve high rewards. In multi-agent systems, coordinated exploration and behaviour is critical for agents to jointly achieve optimal outcomes. In this paper, we introduce a new general framework for improving coordination and performance of multi-agent reinforcement learners (MARL). Our framework, named Learnable Intrinsic-Reward Generation Selection algorithm (LIGS) introduces an adaptive learner, Generator that observes the agents and learns to construct intrinsic rewards online that coordinate the agents’ joint exploration and joint behaviour. Using a novel combination of reinforcement learning (RL) and switching controls, LIGS determines the best states to learn to add intrinsic rewards which leads to a highly efficient learning process. LIGS can subdivide complex tasks making them easier to solve and enables systems of RL agents to quickly solve environments with sparse rewards. LIGS can seamlessly adopt existing multi-agent RL algorithms and our theory shows that it ensures convergence to joint policies that deliver higher system performance. We demonstrate the superior performance of the LIGS framework in challenging tasks in Foraging and StarCraft II and show LIGS is capable of tackling tasks previously unsolvable by MARL methods.

Supplementary Material: zip

12 Replies

Loading