Composing Complex Skills by Learning Transition Policies with Proximity Reward Induction

Youngwoon Lee*, Shao-Hua Sun*, Sriram Somasundaram, Edward Hu, Joseph J. Lim

Sep 27, 2018, ICLR 2019 Conference Blind Submission
  • Abstract: Intelligent creatures acquire complex skills by exploiting previously learned skills and learning to transition between them. To give machines this ability, we propose transition policies, which connect primitive skills to perform sequential tasks without handcrafted rewards. To train the transition policies efficiently, we introduce proximity predictors, which induce rewards that gauge proximity to suitable initial states for the next skill. We evaluate the proposed method on a diverse set of continuous control experiments in MuJoCo, covering both bipedal locomotion and robotic arm manipulation tasks. We demonstrate that transition policies enable us to learn complex tasks and that the induced proximity reward computed using the proximity predictor improves training efficiency. See https://sites.google.com/view/transitions-iclr2019 for videos of policies learned by our algorithm and baselines.
  • Keywords: reinforcement learning, hierarchical reinforcement learning, continuous control, modular network
  • TL;DR: Transition policies enable agents to execute learned skills smoothly to perform complex tasks.
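
The abstract describes proximity predictors that induce rewards gauging how close a state is to a suitable initial state for the next skill. A minimal sketch of that idea follows. It rests on assumptions not stated in the abstract: the reward is taken as the step-to-step change in predicted proximity, and `toy_proximity` is a hypothetical hand-written stand-in for a learned predictor, with `goal` and `scale` chosen only for illustration.

```python
import math
from typing import Callable, List, Sequence

State = Sequence[float]


def toy_proximity(state: State, goal: State = (1.0, 0.0), scale: float = 1.0) -> float:
    """Hypothetical stand-in for a learned proximity predictor.

    Returns a value in (0, 1] that decays with Euclidean distance to a single
    assumed "good" initial state for the next skill.
    """
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(state, goal)))
    return math.exp(-dist / scale)


def induced_rewards(states: List[State], proximity: Callable[[State], float]) -> List[float]:
    """Assumed reward form: change in predicted proximity between consecutive states."""
    scores = [proximity(s) for s in states]
    return [nxt - cur for cur, nxt in zip(scores, scores[1:])]


# A transition trajectory that moves toward the assumed initiation state.
trajectory = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0)]
rewards = induced_rewards(trajectory, toy_proximity)
```

Under this assumed form, every step that moves the agent closer to the next skill's initiation states earns a positive reward, so the transition policy receives a dense signal rather than waiting for the next skill to succeed or fail.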