CURATE: Automatic Curriculum Learning for Reinforcement Learning Agents through Competence-Based Curriculum Policy Search
Keywords: curriculum learning, reinforcement learning
TL;DR: CURATE trains RL agents to complete difficult target tasks by learning a curriculum that dynamically scales the task difficulty to the current capabilities of the agent.
Abstract: Without informed priors or specialized algorithms, reinforcement learning agents face fundamental exploration challenges on difficult tasks: they may rarely receive informative rewards, leading to inefficient learning. To address these challenges, we introduce CURATE, an automatic curriculum learning algorithm for reinforcement learning agents designed for difficult target task distributions. Through "exploration by exploitation," CURATE dynamically scales the task difficulty to match the agent's current competence. By exploiting capabilities acquired in easier tasks, the agent improves its exploration of more difficult ones. Our key insight is that the performance gain on tasks close to those used for training is inversely proportional to their difficulty, so an agent that at any given time trains on a nearby distribution of the easiest unsolved tasks automatically induces an easiest-to-hardest curriculum. To achieve this, CURATE conducts policy search in the task space to learn the best task distribution for training the agent. As the agent's mastery grows, the learned curriculum adapts in an approximately easiest-to-hardest, task-directed fashion, efficiently culminating in an agent that can solve the target tasks. Our experiments across three domains of varying task parameterization and dimensionality demonstrate that CURATE learns highly effective curricula, matching or exceeding prior curriculum methods in target task performance. Moreover, CURATE curricula are effective beyond solving the difficult target tasks, yielding broadly capable agents.
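To make the "easiest unsolved tasks" idea concrete, here is a minimal sketch of a competence-based selection rule. All names, thresholds, and the hard-selection logic are illustrative assumptions; CURATE itself performs policy search over task distributions rather than applying a fixed rule like this.

```python
def select_curriculum_tasks(success_rates, difficulties, mastery=0.9, k=3):
    """Pick the k easiest tasks the agent has not yet mastered.

    success_rates: per-task empirical success estimates in [0, 1]
    difficulties: per-task difficulty scores (lower = easier)
    mastery: success threshold above which a task counts as solved
    NOTE: hypothetical sketch of easiest-unsolved selection, not CURATE.
    """
    # Tasks whose success rate is still below the mastery threshold.
    unsolved = [i for i, s in enumerate(success_rates) if s < mastery]
    # Sort unsolved tasks from easiest to hardest and keep the first k.
    unsolved.sort(key=lambda i: difficulties[i])
    return unsolved[:k]

# Example: five tasks of increasing difficulty; task 0 is already mastered.
rates = [0.95, 0.8, 0.4, 0.1, 0.0]
diffs = [1, 2, 3, 4, 5]
print(select_curriculum_tasks(rates, diffs))  # -> [1, 2, 3]
```

As the agent's success rates rise on the selected tasks, the unsolved set shifts toward harder tasks, which is the easiest-to-hardest progression the abstract describes.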
Primary Area: reinforcement learning
Submission Number: 15449