Curriculum as Selective Data Acquisition: Toward Reliable Generalization in Goal-Conditioned RL

ICLR 2026 Conference Submission 20369 Authors

19 Sept 2025 (modified: 08 Oct 2025) · CC BY 4.0
Keywords: Curriculum Learning, Universal Value Function Approximators (UVFA), Goal-Conditioned Reinforcement Learning, Open-Ended Learning, Reward Shaping
TL;DR: We demonstrate that curriculum-guided training significantly improves UVFA generalization in open-ended RL by adaptively focusing on difficult goals, yielding faster convergence and higher success rates compared to uniform goal sampling.
Abstract: We study curriculum learning in goal-conditioned reinforcement learning (GCRL) through the lens of data selection. Instead of sampling all goals uniformly, we bias sampling toward underachieved goals, thereby shifting the state–goal distribution seen by the agent. Using universal value function approximators (UVFAs) with potential-based reward shaping in GridWorld, we compare uniform and curriculum-guided training. Our results show that curricula alter goal coverage, reduce approximation error, and improve success on difficult edge goals. These findings highlight curriculum learning as a principled mechanism for selective data acquisition, suggesting a pathway toward more persistent and open-ended agents.
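The sampling bias described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the per-goal success-rate bookkeeping, and the rule of weighting each goal by `1 - success_rate` are all assumptions introduced here.

```python
import random

def curriculum_sample(goals, success_rates, rng=random):
    """Sample a goal, biased toward underachieved goals.

    Hypothetical sketch: weights each goal by 1 - empirical success
    rate, so goals the agent fails on are drawn more often. The
    weighting rule is illustrative, not taken from the paper.
    """
    weights = [1.0 - success_rates[g] for g in goals]
    if sum(weights) == 0:
        # Every goal is mastered; fall back to uniform sampling.
        return rng.choice(goals)
    return rng.choices(goals, weights=weights, k=1)[0]
```

Under this rule the curriculum reduces to uniform sampling both at initialization (all success rates zero, hence equal weights) and once all goals are mastered, matching the abstract's framing of the curriculum as a shift in the state–goal distribution rather than a separate objective.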
Primary Area: reinforcement learning
Submission Number: 20369