Improving Adversarial Training for Two-player Competitive Games via Episodic Reward Engineering

TMLR Paper 4914 Authors

22 May 2025 (modified: 31 May 2025) · Under review for TMLR · CC BY 4.0
Abstract: Training adversarial agents to attack neural network policies has proven both effective and practical. However, we observe that existing methods can be further improved by distinguishing between states that lead to wins and states that lead to losses, and by using reward engineering to steer policy training toward the winning states. In this paper, we introduce a novel adversarial training method with reward engineering for two-player competitive games. Our method extracts historical evaluations of states from past experience using an episodic memory, and then incorporates these evaluations into the rewards through our proposed reward revision method to improve adversarial policy optimization. We evaluate our approach on two-player competitive games in MuJoCo simulation environments, demonstrating that, among existing adversarial policy training techniques, our method achieves the strongest attack performance and is the most difficult for victims to defend against.
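To make the abstract's pipeline concrete, below is a minimal Python sketch of episodic reward revision. It assumes a hash-based episodic memory over coarsely discretized states, a running-mean win/loss evaluation per state, and a simple additive mixing coefficient `beta`; the `EpisodicMemory` class, `revise_reward` function, `bin_size`, and `beta` are all illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np
from collections import defaultdict


class EpisodicMemory:
    """Stores historical win/loss evaluations per discretized state.

    The discretization grid, the running-mean update, and the additive
    reward bonus below are illustrative assumptions, not necessarily
    the construction used in the paper.
    """

    def __init__(self, bin_size=0.5):
        self.bin_size = bin_size
        self.value = defaultdict(float)  # mean outcome per state key
        self.count = defaultdict(int)

    def _key(self, state):
        # Coarse discretization so that similar states share an entry.
        return tuple(np.floor(np.asarray(state) / self.bin_size).astype(int))

    def update(self, states, outcome):
        # outcome: +1 for a winning episode, -1 for a losing one.
        # Incrementally update the mean outcome of every visited state.
        for s in states:
            k = self._key(s)
            self.count[k] += 1
            self.value[k] += (outcome - self.value[k]) / self.count[k]

    def evaluate(self, state):
        # Unseen states default to a neutral evaluation of 0.0.
        return self.value[self._key(state)]


def revise_reward(memory, state, env_reward, beta=0.1):
    """Mix the environment reward with the episodic evaluation.

    States that historically led to wins receive a bonus, while states
    that led to losses are penalized, steering the adversarial policy
    toward winning regions of the state space.
    """
    return env_reward + beta * memory.evaluate(state)
```

In training, one would call `memory.update(episode_states, +1 or -1)` at the end of each episode and substitute `revise_reward(...)` for the raw environment reward when computing policy-gradient updates.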
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Tongzheng_Ren1
Submission Number: 4914