Toward Robust Agents: A Survey of Adversarial Attacks and Defenses in Deep Reinforcement Learning

Adithya Mohan, Torsten Schön

Published: 2026, Last Modified: 26 Feb 2026, Venue: IEEE Access 2026, License: CC BY-SA 4.0

Abstract: Deep Reinforcement Learning (DRL) has demonstrated remarkable success in autonomous decision-making across diverse domains, including robotics, autonomous driving, and game playing. However, recent studies have uncovered a critical vulnerability: DRL agents are highly susceptible to adversarial attacks that can significantly degrade their performance or lead to catastrophic failure. These attacks exploit different components of the learning pipeline (observations, actions, rewards, and policies), exposing new challenges unique to DRL compared to supervised learning. This survey provides a comprehensive examination of adversarial threats and corresponding defense mechanisms within the DRL paradigm. It also aims to serve as a foundational reference for researchers and practitioners seeking to understand and mitigate adversarial vulnerabilities in DRL.

External IDs: dblp:journals/access/MohanS26