Keywords: curriculum design, reinforcement learning, zone of proximal development
TL;DR: We propose a novel curriculum strategy for deep reinforcement learning agents based on the concept of the Zone of Proximal Development.
Abstract: We consider the problem of curriculum design for reinforcement learning (RL) agents in contextual multi-task settings. Existing techniques for automatic curriculum design typically have limited theoretical underpinnings or require domain-specific hyperparameter tuning. To tackle these limitations, we design our curriculum strategy, ProCuRL, from basic principles inspired by the pedagogical concept of the Zone of Proximal Development (ZPD). We mathematically derive ProCuRL by formalizing the ZPD concept, which suggests that learning progress is maximized when the learner is given tasks that are neither too hard nor too easy. We also present a practical variant of ProCuRL that can be directly integrated with deep RL frameworks with minimal hyperparameter tuning. Experimental results on a variety of domains demonstrate the effectiveness of our curriculum strategy over state-of-the-art baselines in accelerating the training process of deep RL agents.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Supplementary Material: zip
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Reinforcement Learning (eg, decision and control, planning, hierarchical RL, robotics)