DECENTRALIZED MULTI-AGENT REINFORCEMENT LEARNING VIA ANTICIPATION SHARING

22 Sept 2023 (modified: 11 Feb 2024) | Submitted to ICLR 2024
Primary Area: reinforcement learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: multi-agent reinforcement learning, decentralized learning, cooperative multi-agent learning, social welfare
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: In cooperative and decentralized multi-agent reinforcement learning (MARL), a fundamental challenge is reconciling individual incentives with collective outcomes. Previous studies often rely on algorithms in which agents share rewards, values, or policy models to align individual and collective goals. However, these methods raise issues such as policy discoordination, privacy concerns, and considerable communication overhead. In this work, we obviate the need to share rewards, values, or model parameters. To bridge the gap between individual and collective goals, we construct a personal objective derived from the collective one by comparing what each agent anticipates its neighbors will do with what those neighbors actually intend to do. We introduce a novel decentralized MARL method built on this idea, termed Anticipation Sharing, in which each agent updates its anticipations of the action distributions of neighboring agents, reflecting its own preferences, and shares these anticipations with the corresponding agents. Based on the anticipations it receives, each agent then rectifies the deviation of its individual policy from the collective cooperation objective. We validate the approach through both theoretical analysis and experiments in simulated environments, showing that the proposed MARL framework can induce cooperative behaviors among agents even when rewards, policies, and values remain private. This offers a new way to orchestrate cooperation by explicitly reconciling individual and collective interests within multi-agent systems.
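The abstract sketches an iterative protocol: agents privately update anticipations about their neighbors' action distributions, exchange only those anticipations, and then nudge their own policies toward what neighbors anticipate of them. The toy Python sketch below illustrates one plausible reading of that loop under stated assumptions; the class, function names, update rules, and hyperparameters are all hypothetical and are not the authors' implementation.

```python
# Illustrative sketch of an anticipation-sharing loop (hypothetical, not the paper's code).
import numpy as np

N_AGENTS, N_ACTIONS, LR = 3, 4, 0.1
rng = np.random.default_rng(0)

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

class Agent:
    def __init__(self, idx, neighbours):
        self.idx = idx
        self.neighbours = neighbours
        self.logits = rng.normal(size=N_ACTIONS)  # private policy parameters
        # Anticipated action distributions this agent holds about each neighbour.
        self.anticipations = {j: np.full(N_ACTIONS, 1.0 / N_ACTIONS) for j in neighbours}

    def policy(self):
        return softmax(self.logits)

    def update_anticipations(self, local_signal):
        # Placeholder update: nudge anticipations toward the neighbour behaviour
        # this agent would prefer, driven by a privately observed local signal.
        for j in self.neighbours:
            self.anticipations[j] = softmax(
                np.log(self.anticipations[j] + 1e-8) + LR * local_signal)

    def rectify_policy(self, received):
        # `received` maps neighbour id -> the anticipation that neighbour holds
        # about THIS agent; penalise deviation of the own policy from those anticipations.
        pi = self.policy()
        grad = np.zeros_like(self.logits)
        for anticipated in received.values():
            grad += anticipated - pi  # pull the policy toward the received anticipations
        self.logits += LR * grad

# Fully connected neighbourhood for this toy example.
agents = [Agent(i, [j for j in range(N_AGENTS) if j != i]) for i in range(N_AGENTS)]

for step in range(100):
    # 1) Each agent privately updates its anticipations about its neighbours.
    for a in agents:
        a.update_anticipations(local_signal=rng.normal(size=N_ACTIONS))
    # 2) Agents exchange anticipations only (no rewards, values, or model parameters).
    inbox = {i: {} for i in range(N_AGENTS)}
    for a in agents:
        for j in a.neighbours:
            inbox[j][a.idx] = a.anticipations[j]
    # 3) Each agent rectifies its own policy toward what neighbours anticipate of it.
    for a in agents:
        a.rectify_policy(inbox[a.idx])

print([np.round(a.policy(), 2) for a in agents])
```

Note that in this sketch only probability distributions over actions cross agent boundaries, which mirrors the privacy claim in the abstract; how anticipations are actually derived from the collective objective is specified in the paper, not here.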
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 5115