Abstract: With the rapid development of multi-agent reinforcement learning, communication among agents has become a research hotspot because of its significant role in promoting cooperation among automated devices. However, in real-world scenarios, agents such as unmanned vehicles and robots are likely to face communication resource constraints, which makes designing efficient communication protocols essential. In this paper, we propose to quantize messages and reduce their discrete entropy to achieve effective multi-agent communication under bandwidth limits. Achieving this goal requires solving two challenges. The first is that the gradient of discrete entropy is zero everywhere except at a few points of discontinuity, where it is undefined, which makes it hard to reduce discrete entropy via gradient-based training. To overcome this, we design a Surrogate Entropy Minimization (SEM) scheme and confirm its effectiveness theoretically. The second challenge is maximizing cooperation performance under a given bandwidth limit. We model this as a constrained optimization problem and design a Soft Barrier Method (SBM) to solve it. We evaluate the proposed scheme alongside four other methods in six environment settings and under five different bandwidth limits, where it demonstrates outstanding performance. Specifically, it reduces bandwidth consumption by up to 90% with little or no loss of cooperation performance.
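The first challenge can be illustrated with a small numerical sketch. The code below is not the SEM scheme from the paper. It is only a toy demonstration, under assumed choices (a uniform rounding quantizer, an empirical Shannon entropy estimate, and hypothetical helper names `quantize` and `discrete_entropy`), of why the discrete entropy of quantized messages gives no useful gradient signal: a small perturbation of the messages leaves the entropy unchanged, while a larger one changes it only by a jump.

```python
import math
from collections import Counter


def quantize(messages, step=1.0):
    """Assumed quantizer: map each real-valued message to the index of
    the nearest multiple of `step`."""
    return [round(m / step) for m in messages]


def discrete_entropy(symbols):
    """Empirical Shannon entropy, in bits, of a sequence of discrete symbols."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Illustrative message values, chosen so that none sits near a rounding boundary.
messages = [0.1, 0.2, 0.9, 1.1, 1.8, 2.2, 2.9, 3.1]
base = discrete_entropy(quantize(messages))

# A small shift moves no message across a quantization boundary, so the
# entropy is unchanged and the finite-difference "gradient" is zero.
eps = 1e-3
nudged = discrete_entropy(quantize([m + eps for m in messages]))
finite_diff = (nudged - base) / eps

# A larger shift pushes some messages into neighbouring bins, so the
# entropy changes abruptly rather than smoothly.
jumped = discrete_entropy(quantize([m + 0.45 for m in messages]))

print(base, finite_diff, jumped)
```

Because the entropy is piecewise constant in the message values, a gradient-based optimizer sees a zero gradient almost everywhere, which is the motivation the abstract gives for minimizing a differentiable surrogate instead.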