Abstract: We prove a fundamental limitation on the computational efficiency of a large class of Reinforcement Learning (RL) methods. This limitation applies to model-free RL methods as well as to some model-based methods, such as AlphaZero. We provide a formalism that describes this class and present a family of RL problems that is provably intractable for these methods. In contrast, the problems in this family can be solved efficiently by toy methods. We identify several types of algorithms proposed in the literature that avoid our limitation, including algorithms that construct an inverse dynamics model and planning algorithms that leverage an explicit model of the dynamics.
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Matteo_Papini1
Submission Number: 3246