Reinforcement Learning: the Sooner the Better, or the Later the Better?

Published: 01 Jan 2016, Last Modified: 07 Aug 2024, UMAP 2016, CC BY-SA 4.0
Abstract: Reinforcement Learning (RL) is one of the best machine learning approaches for decision making in interactive environments. RL focuses on inducing effective decision-making policies with the goal of maximizing the agent's cumulative reward. In this study, we investigated the impact of both immediate and delayed reward functions on RL-induced policies and empirically evaluated the effectiveness of the induced policies within an Intelligent Tutoring System called Deep Thought. We also divided students into Fast and Slow learners based on their incoming competence, as measured by their average response time on the initial tutorial level. Our results show a significant interaction effect between the induced policies and the students' incoming competence. More specifically, Fast learners were less sensitive to the learning environment: they learned equally well regardless of the pedagogical strategies employed by the tutor. Slow learners, by contrast, benefited significantly more from effective pedagogical strategies than from ineffective ones. In fact, with effective pedagogical strategies the Slow learners learned as much as their Fast peers, but with ineffective pedagogical strategies they learned significantly less.
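
The distinction between immediate and delayed rewards can be illustrated with a minimal sketch. This is not the reward function used in the study: the episode length, the reward values, the discount factor `gamma`, and the helper name `discounted_return` are all assumptions chosen for illustration. The sketch shows how the same total reward yields a different cumulative (discounted) return depending on when it is assigned.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of gamma**t * r_t over a trajectory of per-step rewards."""
    total = 0.0
    for t, r in enumerate(rewards):
        total += (gamma ** t) * r
    return total


# Hypothetical 4-step tutoring episode with a total reward of 1.0.
immediate = [0.25, 0.25, 0.25, 0.25]  # reward assigned after every step
delayed = [0.0, 0.0, 0.0, 1.0]        # same total, assigned only at the end

immediate_return = discounted_return(immediate)
delayed_return = discounted_return(delayed)

print(f"immediate: {immediate_return:.5f}")
print(f"delayed:   {delayed_return:.5f}")
```

With any discount factor below 1, the delayed scheme is discounted more heavily because all of its reward arrives at the last step; with `gamma=1.0` the two returns coincide. This is one reason the choice of reward function can change which policy an RL algorithm induces.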