Using Deep Reinforcement Learning to Train and Evaluate Instructional Sequencing Policies for an Intelligent Tutoring System

Jithendaraa Subramanian; David Mostow

Using Deep Reinforcement Learning to Train and Evaluate Instructional Sequencing Policies for an Intelligent Tutoring System

Jithendaraa Subramanian, David Mostow

28 Sept 2020 (modified: 05 May 2023)ICLR 2021 Conference Blind SubmissionReaders: Everyone

Keywords: Deep Reinforcement Learning, Intelligent Tutoring Systems, Adaptive policy, Instructional Sequencing

Abstract: We present STEP, a novel Deep Reinforcement Learning solution to the problem of learning instructional sequencing. STEP has three components: 1. Simulate the student by fitting a knowledge tracing model to data logged by an intelligent tutoring system. 2. Train instructional sequencing policies by using Proximal Policy Optimization. 3. Evaluate the learned instructional policies by estimating their local and global impact on learning gains. STEP leverages the student model by representing the student’s knowledge state as a probability vector of knowing each skill and using the student’s estimated learning gains as its reward function to evaluate candidate policies. A learned policy represents a mapping from each state to an action that maximizes the reward, i.e. the upward distance to the next state in the multi-dimensional space. We use STEP to discover and evaluate potential improvements to a literacy and numeracy tutor used by hundreds of children in Tanzania.

One-sentence Summary: A Deep Reinforcement Learning framework that can be used by Intelligent Tutoring System to learn an instructional policy that maximizes student learning gains.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Supplementary Material: zip

Reviewed Version (pdf): https://openreview.net/references/pdf?id=3NNK6Orb08R

7 Replies

Loading