PIMAEX: Multi-Agent Exploration Through Peer Incentivization

Published: 01 Jan 2025, Last Modified: 15 May 2025ICAART (1) 2025EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: While exploration in single-agent reinforcement learning has been studied extensively in recent years, consid-erably less work has focused on its counterpart in multi-agent reinforcement learning. To address this issue, this work proposes a peer-incentivized reward function inspired by previous research on intrinsic curiosity and influence-based rewards. The PIMAEX reward, short for Peer-Incentivized Multi-Agent Exploration, aims to improve exploration in the multi-agent setting by encouraging agents to exert influence over each other to increase the likelihood of encountering novel states. We evaluate the PIMAEX reward in conjunction with PIMAEX-Communication, a multi-agent training algorithm that employs a communication channel for agents to influence one another. The evaluation is conducted in the Consume/Explore environment, a partially observable environment with deceptive rewards, specifically designed to challenge the exploration vs. exploitation dilemma and the credit-assignm
Loading