State-Aware Perturbation Optimization for Robust Deep Reinforcement Learning

Zongyuan Zhang, Tianyang Duan, Zheng Lin, Dong Huang, Zihan Fang, Zekai Sun, Ling Xiong, Hongbin Liang, Heming Cui, Yong Cui

Published: 2026, Last Modified: 19 Mar 2026, Venue: IEEE Trans. Mob. Comput. 2026, License: CC BY-SA 4.0
Abstract: Deep reinforcement learning (DRL) has recently emerged as a promising approach for robotic control. However, its deployment on real-world robots is hindered by its sensitivity to environmental perturbations. Existing white-box adversarial attacks used to evaluate DRL robustness rely on local gradient information and apply uniform perturbations across all states, so they fail to account for temporal dynamics and state-specific vulnerabilities. To address these limitations, we first conduct a theoretical analysis of white-box attacks in DRL by establishing the Adversarial Victim Dynamics Markov Decision Process (AVD-MDP), from which we derive the necessary and sufficient conditions for a successful attack. Based on this analysis, we propose the Selective State-Aware Reinforcement adversarial attack (STAR), which optimizes perturbation stealthiness and state-visitation dispersion. STAR first employs a soft mask-based state-targeting mechanism to minimize redundant perturbations, enhancing stealthiness and attack effectiveness. It then incorporates an information-theoretic optimization objective that maximizes the mutual information between perturbations, environmental states, and victim actions, ensuring a dispersed state-visitation distribution that steers the victim agent into vulnerable states for maximum return reduction. Extensive experiments demonstrate that STAR outperforms state-of-the-art benchmarks.