Keywords: Curriculum learning, Deep RL, goal conditioned RL
Abstract: A long-standing challenge in the reinforcement learning (RL) community has been to train a goal-conditioned agent in a sparse-reward environment such that it can also generalize to unseen goals. Empirical results in Fetch-Reach and a novel driving simulator demonstrate that our proposed algorithm, Multi-Teacher Asymmetric Self-Play, allows one agent (i.e., a teacher) to create a successful curriculum for another agent (i.e., the student). Surprisingly, results also show that training with multiple teachers helps the student learn faster. Our analysis shows that multiple teachers provide better coverage of the state space by selecting diverse sets of goals, thereby helping the student learn more effectively. Moreover, results show that completely new students can learn offline from the goals generated by teachers trained with a previous student. This is crucial in the context of industrial robotics, where repeatedly training a teacher agent is expensive and sometimes infeasible.
Supplementary Material: zip