From Child's Play to AI: Insights into Automated Causal Curriculum Learning

Published: 20 Oct 2023 · Last Modified: 30 Nov 2023 · IMOL@NeurIPS2023
Keywords: curriculum learning, open learning, children, reinforcement learning, intrinsic reward
TL;DR: In curriculum learning, RL agents benefit from using progress as an intrinsic reward signal, as children do.
Abstract: We study how reinforcement learning algorithms and children develop a causal curriculum to achieve a challenging goal that is not solvable at first. Using the Procgen environments, which include a variety of challenging tasks, we found that 5- to 7-year-old children actively used their competence at the current level to determine the next step in their curriculum, and improved their performance during this process as a result. This suggests that children treat level competence as an intrinsic reward and are motivated to master easier levels in order to do better at more difficult ones, even without explicit reward. To evaluate RL agents, we exposed them to the same demanding Procgen environments as the children and employed several curriculum learning methodologies. Our results demonstrate that RL agents that emulate children by incorporating level competence as an additional reward signal exhibit greater stability and are more likely to converge during training than RL agents that rely solely on extrinsic reward signals for game-solving. Curriculum learning may also substantially reduce the number of frames needed to solve a target environment. Taken together, our human-inspired findings suggest a potential path forward for addressing catastrophic forgetting or domain shift during curriculum learning in RL agents.
Submission Number: 26
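
The abstract describes adding level competence as an intrinsic reward on top of the extrinsic game reward. As a rough illustration only (the paper's exact formulation is not given on this page), the sketch below shows one plausible way to implement such a bonus as a Gymnasium-style reward wrapper. The wrapper name, the `level` key in `info`, and the episode-return success proxy are all assumptions for illustration, not the authors' method.

```python
import gymnasium as gym


class CompetenceRewardWrapper(gym.Wrapper):
    """Hypothetical wrapper: adds a level-competence bonus to the extrinsic reward.

    Competence is tracked as a smoothed per-level success rate; the bonus
    rewards *improvement* in that rate, echoing how children in the study
    used current-level competence to guide their curriculum.
    """

    def __init__(self, env, bonus_scale=0.5, ema=0.1):
        super().__init__(env)
        self.bonus_scale = bonus_scale
        self.ema = ema          # smoothing factor for the running success rate
        self.competence = {}    # level id -> smoothed success rate
        self.level = 0
        self.episode_return = 0.0

    def reset(self, **kwargs):
        obs, info = self.env.reset(**kwargs)
        # Assumed: the environment reports a level identifier in `info`.
        self.level = info.get("level", 0)
        self.episode_return = 0.0
        return obs, info

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        self.episode_return += reward
        bonus = 0.0
        if terminated or truncated:
            # Crude success proxy: positive episode return counts as success.
            success = float(self.episode_return > 0)
            old = self.competence.get(self.level, 0.0)
            new = (1 - self.ema) * old + self.ema * success
            self.competence[self.level] = new
            # Reward only positive changes in competence (learning progress).
            bonus = self.bonus_scale * max(new - old, 0.0)
        return obs, reward + bonus, terminated, truncated, info
```

A curriculum scheduler could then consult `wrapper.competence` to pick the next level, e.g. advancing once the smoothed success rate on the current level crosses a threshold.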