Proximal Curriculum for Reinforcement Learning Agents

Published: 21 Apr 2023, Last Modified: 21 Apr 2023
Accepted by TMLR
Abstract: We consider the problem of curriculum design for reinforcement learning (RL) agents in contextual multi-task settings. Existing techniques for automatic curriculum design typically require domain-specific hyperparameter tuning or have limited theoretical underpinnings. To tackle these limitations, we design our curriculum strategy, ProCuRL, inspired by the pedagogical concept of the Zone of Proximal Development (ZPD). ProCuRL captures the intuition that learning progress is maximized when picking tasks that are neither too hard nor too easy for the learner. We mathematically derive ProCuRL by analyzing two simple learning settings. We also present a practical variant of ProCuRL that can be directly integrated with deep RL frameworks with minimal hyperparameter tuning. Experimental results on a variety of domains demonstrate the effectiveness of our curriculum strategy over state-of-the-art baselines in accelerating the training process of deep RL agents.
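As a toy illustration of the ZPD intuition mentioned in the abstract — not the actual ProCuRL criterion derived in the paper — one could score candidate tasks by how close the learner's estimated success probability is to 0.5, so tasks that are neither too hard nor too easy are preferred. All names and estimates below are hypothetical:

```python
# Toy sketch of a ZPD-style task picker: prefer tasks that are neither
# too hard nor too easy for the current learner. Illustrative only; this
# is NOT the ProCuRL objective from the paper.

def zpd_score(success_prob: float) -> float:
    """Score peaks at success_prob = 0.5 and vanishes at 0 and 1."""
    return success_prob * (1.0 - success_prob)

def pick_task(task_success_probs: dict) -> str:
    """Return the task whose estimated success probability is closest to 0.5."""
    return max(task_success_probs, key=lambda t: zpd_score(task_success_probs[t]))

# Hypothetical learner estimates for three candidate tasks:
estimates = {"easy": 0.95, "medium": 0.55, "hard": 0.05}
print(pick_task(estimates))  # prints "medium"
```

In practice, success probabilities are not known and must be estimated from the agent's rollouts, which is part of what a practical curriculum strategy such as the paper's ProCuRL variant has to handle.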
Submission Length: Regular submission (no more than 12 pages of main content)
Previous TMLR Submission Url:
Changes Since Last Submission: **Changes for the camera-ready revision**
* Section 1: We added a link to the GitHub repository.
* Section 4 / Appendix C: We updated Figures 3, 8, 9, 10, and 11 with confidence intervals computed using the t-distribution table.
* Appendix C: We added Figure 12, which shows the distribution of tasks for the uniform pool and the harder pool.
Supplementary Material: zip
Assigned Action Editor: ~Martha_White1
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Number: 681