Generalizing Poincaré Policy Representations in Multi-agent Reinforcement Learning

23 Sept 2023 (modified: 11 Feb 2024). Submitted to ICLR 2024.
Primary Area: reinforcement learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: policy representation, reinforcement learning, multi-agent
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Learning policy representations is essential for comprehending the intricacies of agent interactions and their decision-making processes. Recent studies have found that the evolution of any state under Markov decision processes (MDPs) can be divided into multiple hierarchies based on time sequences. This conceptualization resembles a tree-growing process, where the policy and environment dynamics determine the possible branches. In this paper, we project the trajectory growth paths of multiple agents into a Poincaré ball, which requires the tree to grow from the origin toward the boundary of the ball, and derive a new geometric approach to learning Poincaré Policy Representations (P2R) for multi-agent reinforcement learning (MARL). Specifically, P2R captures policy representations in the Poincaré ball with a hyperbolic neural network and introduces a contrastive objective function that encourages embeddings of the same policy to move closer together and embeddings of different policies to move apart, which enables embedding policies with low distortion. Experimental results provide empirical evidence for the effectiveness of the P2R framework in cooperative and competitive games, demonstrating the potential of Poincaré policy representations for optimizing policies in complex multi-agent environments.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6972
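
Illustrative sketch (not from the submission): the abstract describes embedding policies in a Poincaré ball and training with a contrastive objective. The snippet below shows one plausible reading of that idea, using the standard Poincaré ball distance and an InfoNCE-style loss. The function names, the `temperature` parameter, and the specific loss form are assumptions made for illustration; the objective actually used by P2R may differ.

```python
import numpy as np


def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance between points strictly inside the unit Poincaré ball."""
    sq_diff = np.sum((u - v) ** 2, axis=-1)
    denom = (1.0 - np.sum(u ** 2, axis=-1)) * (1.0 - np.sum(v ** 2, axis=-1))
    # eps guards against division by zero for points very close to the boundary.
    return np.arccosh(1.0 + 2.0 * sq_diff / np.maximum(denom, eps))


def contrastive_loss(anchor, positive, negatives, temperature=1.0):
    """InfoNCE-style loss (assumed form) with negative distance as similarity.

    anchor, positive: arrays of shape (d,), embeddings of the same policy.
    negatives: array of shape (K, d), embeddings of other policies.
    """
    pos = -poincare_distance(anchor, positive) / temperature
    neg = -poincare_distance(anchor[None, :], negatives) / temperature
    logits = np.concatenate([[pos], neg])
    # Numerically stable log-sum-exp over the positive and all negatives.
    m = logits.max()
    log_norm = m + np.log(np.sum(np.exp(logits - m)))
    return -(pos - log_norm)


if __name__ == "__main__":
    anchor = np.array([0.10, 0.00])
    same_policy = np.array([0.12, 0.00])
    other_policies = np.array([[-0.80, 0.00], [0.00, 0.80]])
    print(poincare_distance(np.zeros(2), np.array([0.5, 0.0])))
    print(contrastive_loss(anchor, same_policy, other_policies))
```

Under this assumed loss, pulling the positive embedding toward the anchor or pushing negatives toward the boundary of the ball lowers the loss, which matches the qualitative behaviour the abstract describes.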