Certifiably Robust Policy Learning against Adversarial Multi-Agent Communication

Yanchao Sun; Ruijie Zheng; Parisa Hassanzadeh; Yongyuan Liang; Soheil Feizi; Sumitra Ganesh; Furong Huang

Published: 01 Feb 2023, Last Modified: 04 Aug 2025, Venue: ICLR 2023 poster, Readers: Everyone

Keywords: certifiable robustness, reinforcement learning, multi-agent system, adversarial communication, adversarial attack

TL;DR: We propose a defense under which an agent receiving communication in a multi-agent system is certifiably robust even when a subset of the messages it receives is arbitrarily perturbed.

Abstract: Communication is important in many multi-agent reinforcement learning (MARL) problems, where agents share information to make good decisions. However, when trained communicative agents are deployed in real-world applications where noise and potential attackers exist, the safety of communication-based policies becomes a serious and underexplored concern. Specifically, if communication messages are manipulated by malicious attackers, agents that rely on untrustworthy communication may take unsafe actions with catastrophic consequences. It is therefore crucial to ensure that agents are not misled by corrupted communication while still benefiting from benign communication. In this work, we consider an environment with $N$ agents, where the attacker may arbitrarily change the communication from any $C<\frac{N-1}{2}$ agents to a victim agent. For this strong threat model, we propose a certifiable defense that constructs a message-ensemble policy aggregating multiple randomly ablated message sets. Theoretical analysis shows that this message-ensemble policy can exploit benign communication while remaining certifiably robust to adversarial communication, regardless of the attack algorithm. Experiments in multiple environments verify that our defense significantly improves the robustness of trained policies against various types of attacks.
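
The message-ensemble idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a discrete action space, majority voting as the aggregation rule, and exhaustive enumeration of all size-`k` message subsets in place of random sampling, so that the result is deterministic. The names `ensemble_action`, `clean_majority`, and `toy_policy` are hypothetical. The counting check relies only on the fact that, with `n` incoming messages of which `c` are corrupted, exactly `comb(n - c, k)` of the `comb(n, k)` subsets contain no corrupted message.

```python
from collections import Counter
from itertools import combinations
from math import comb


def ensemble_action(messages, base_policy, k):
    """Majority vote over base_policy applied to every size-k subset of messages.

    Assumption: discrete actions and plain majority voting; the abstract does
    not specify the aggregation rule.
    """
    votes = Counter(base_policy(subset) for subset in combinations(messages, k))
    return votes.most_common(1)[0][0]


def clean_majority(num_messages, num_corrupted, k):
    """True if strictly more than half of the size-k subsets are attack-free."""
    total = comb(num_messages, k)
    clean = comb(num_messages - num_corrupted, k)
    return 2 * clean > total


def toy_policy(subset):
    """Hypothetical base policy: choose action 1 if the messages sum to > 0."""
    return 1 if sum(subset) > 0 else 0


# N = 6 agents, so the victim receives 5 messages; one of them is corrupted.
benign = (1, 1, 1, 1, 1)
attacked = (1, 1, 1, 1, -100)

naive = toy_policy(attacked)                        # uses all messages at once
robust = ensemble_action(attacked, toy_policy, k=2)
print(naive, robust, clean_majority(5, 1, 2))
```

In this toy setting the policy that consumes all five messages is flipped by the single corrupted message, while the ensemble with `k = 2` still returns the action chosen under benign communication, because 6 of the 10 subsets are attack-free. With two corrupted messages, `k = 2` no longer gives a clean majority (3 of 10 subsets), whereas `k = 1` does (3 of 5), which suggests why the ablation size has to be chosen relative to the number of attackers.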

Area: Reinforcement Learning (e.g., decision and control, planning, hierarchical RL, robotics)

Supplementary Material: zip
