Abstract: The performance of models trained by Multi-Agent Reinforcement Learning (MARL) is sensitive to perturbations in observations, which lowers their trustworthiness in complex environments. Adversarial training is a valuable approach to enhancing their robustness. However, existing methods often overfit to adversarial perturbations of observations and fail to incorporate prior information about the policy adopted by the protagonist agent, i.e., the primary agent being trained. To address this issue, this paper introduces Adversarial Training with Stochastic Adversary (ATSA), in which the proposed adversary is trained online alongside the protagonist agent. The adversary consists of a Stochastic Director (SDor) and an SDor-guided generaTor (STor). SDor performs policy perturbations by minimizing the expected team reward of the protagonists and maximizing the entropy of its own policy, while STor generates adversarial perturbations of observations under SDor's guidance. We prove that SDor's soft policy converges to a global optimum under factorized maximum-entropy MARL and yields the optimal adversary. This paper also introduces an SDor-STor loss function that quantifies the difference between a) the perturbations induced in the protagonist's policy and b) those advised by SDor. We evaluate ATSA on StarCraft II tasks and autonomous driving scenarios, demonstrating that a) it is robust against diverse perturbations of observations while maintaining outstanding performance in perturbation-free environments, and b) it outperforms state-of-the-art methods.
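To fix ideas, a minimal mathematical sketch of the two ingredients described above is given below; the notation (perturbation policy $\nu$, protagonist policies $\pi_i$, team reward $R$, temperature $\alpha$, STor perturbation $\delta_{\text{STor}}$) is assumed for illustration only and is not taken from the paper's own equations.

% Hedged sketch; all symbols are illustrative assumptions, not the paper's notation.
\begin{align}
  % SDor's soft adversarial objective: minimize the protagonists' expected
  % discounted team return while maximizing the entropy of its perturbation policy \nu.
  \nu^{*} &= \arg\min_{\nu}\;
    \mathbb{E}_{\tau \sim (\pi, \nu)}\!\left[
      \sum_{t} \gamma^{t}\Bigl( R(s_t, \mathbf{a}_t)
      - \alpha\, \mathcal{H}\bigl(\nu(\cdot \mid s_t)\bigr) \Bigr)
    \right], \\
  % One plausible form of the SDor-STor loss: penalize the divergence between the
  % policy perturbation advised by SDor and the perturbation actually induced in the
  % protagonist's policy by STor's perturbed observation \tilde{o}_i = o_i + \delta_{\text{STor}}(o_i).
  \mathcal{L}_{\text{SDor--STor}} &=
    \mathbb{E}_{o_i}\, D_{\mathrm{KL}}\!\Bigl(
      \nu(\cdot \mid o_i)\;\big\|\;
      \pi_i\bigl(\cdot \mid o_i + \delta_{\text{STor}}(o_i)\bigr)
    \Bigr).
\end{align}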
Lay Summary: Multi-agent reinforcement learning enables multiple agents to learn and coordinate through interactions with their environment. However, these systems are highly sensitive to small changes, or perturbations, in their observations, such as sensor noise or adversarial manipulation. This vulnerability undermines their trustworthiness in real-world applications like autonomous driving and multi-robot systems. To address this, we propose Adversarial Training with Stochastic Adversary (ATSA), a new framework that improves a multi-agent system’s robustness against such perturbations. ATSA jointly trains the main learning agent (the protagonist) and a stochastic adversary composed of two modules: 1) a Stochastic Director (SDor), which perturbs the agent's policy, and 2) an SDor-guided generaTor (STor), which crafts adversarial observations. SDor is trained to minimize the agent’s team reward while maximizing entropy, leading to diverse adversarial behaviors. We show that our approach outperforms existing methods across a range of environments, including StarCraft II and driving simulations. Our findings suggest that training agents with stochastic adversaries improves a multi-agent system’s robustness.
Primary Area: Reinforcement Learning->Multi-agent
Keywords: Multi-Agent Reinforcement Learning, Adversarial Training, Stochastic Adversary, Entropy Maximization
Submission Number: 9974