Efficient Credit Assignment in Cooperative Multi-Agent Reinforcement Learning

TMLR Paper4461 Authors

12 Mar 2025 (modified: 26 Mar 2025) · Under review for TMLR · CC BY 4.0
Abstract: Cooperative multi-agent reinforcement learning (MARL) algorithms are crucial for real-world problems in which multiple agents collaborate to achieve common objectives. The effectiveness of these algorithms hinges on accurate estimation of agent action values, typically obtained by learning joint and individual action values. However, the credit assignment problem makes it difficult to attribute the global reward to the actions of individual agents accurately, which limits sample efficiency. This paper introduces ECA, an episodic control-based method that mitigates this limitation by directly evaluating and assigning individual agent credits. ECA uses episodic memory to store and cluster past interaction experiences between agents and the environment. Building on these experiences, we introduce an intrinsic reward signal that quantifies each agent's credit toward the joint goal. This reward signal serves as a corrective term for individual action values, thereby improving the accuracy of both individual and joint value estimates. We evaluate our method on StarCraft Multi-Agent Challenge (SMAC) and Google Research Football (GRF) tasks, demonstrating that it significantly improves the sample efficiency of state-of-the-art cooperative MARL algorithms.
Submission Length: Regular submission (no more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=N2CbOrJp3N
Changes Since Last Submission: Dear TMLR Editorial Team, We sincerely apologize for the inconvenience, and we appreciate the time and effort the reviewers and editors invested in evaluating our previous submission. We understand that our manuscript was rejected due to inconsistencies with the TMLR paper template. Specifically, we found that some third-party LaTeX packages affected the font and formatting. We have addressed this issue by strictly adhering to the official TMLR template in our resubmission. There have been no substantive changes to the content of the paper beyond this formatting correction. We kindly request the opportunity to resubmit our manuscript for review. Thank you for your consideration; we look forward to your response.
Assigned Action Editor: ~Baoxiang_Wang1
Submission Number: 4461