Deep Reinforcement Learning Agents are not even close to Human Intelligence

Published: 22 Jun 2025, Last Modified: 27 Jul 2025, IBRL @ RLC 2025, CC BY 4.0
Keywords: Generalization, Shortcut Learning, Misalignment
TL;DR: We show that many deep and symbolic RL algorithms consistently produce agents that learn shortcuts and thus cannot generalize to task simplifications.
Abstract: Deep reinforcement learning agents achieve impressive results on a wide variety of tasks, but they lack zero-shot adaptation capabilities. While most robustness evaluations focus on task complexifications, on which humans also struggle to maintain performance, no evaluation has been performed on task simplifications. To tackle this issue, we introduce HackAtari, a set of task variations of the Arcade Learning Environment. We use it to demonstrate that, contrary to humans, RL agents systematically exhibit large performance drops on simpler versions of their training tasks, uncovering the agents' consistent reliance on shortcuts. Our analysis across multiple algorithms and architectures highlights the persistent gap between RL agents and human behavioral intelligence, underscoring the need for new benchmarks and methodologies that enforce systematic generalization testing. It also demonstrates the need to integrate more human inductive biases to achieve truly intelligent agents.
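The evaluation protocol the abstract describes could look roughly like the sketch below: roll out a trained policy zero-shot on a simplified variant of its training task and compare the average return against the original task. The `make_simplified_env` helper and the `policy` callable are hypothetical stand-ins for illustration; this is not HackAtari's actual API.

```python
# Minimal sketch of a zero-shot evaluation on a simplified task variant.
# `make_simplified_env` and `policy` are hypothetical stand-ins, not
# HackAtari's real API.
import gymnasium as gym
import numpy as np

def evaluate(env: gym.Env, policy, episodes: int = 10) -> float:
    """Average undiscounted return of `policy` over `episodes` rollouts."""
    returns = []
    for _ in range(episodes):
        obs, _ = env.reset()
        done, total = False, 0.0
        while not done:
            action = policy(obs)  # frozen, pre-trained policy: no fine-tuning
            obs, reward, terminated, truncated, _ = env.step(action)
            total += reward
            done = terminated or truncated
        returns.append(total)
    return float(np.mean(returns))

# Hypothetical usage (names are illustrative):
# train_env  = gym.make("ALE/Pong-v5")               # original training task
# simple_env = make_simplified_env("Pong", variation="lazy_enemy")
# gap = evaluate(train_env, policy) - evaluate(simple_env, policy)
# A large positive `gap` on the *simpler* task is the shortcut signature.
```

A human player would be expected to score at least as well on the simplified variant; the paper's claim is that trained agents instead degrade sharply, revealing reliance on spurious features of the original task.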
Submission Number: 4