Deep dynamic policy programming for robot control with raw images

Yoshihisa Tsurumine, Yunduan Cui, Eiji Uchibe, Takamitsu Matsubara

2017 (modified: 09 Jun 2022), IROS 2017

Abstract: Deep reinforcement learning has drawn much attention in robot control since it enables agents to learn control policies from very high-dimensional states such as raw images. However, its dependence on a large quantity of training samples and its fragility in learning make it difficult to apply to real-world robot tasks. To alleviate these issues, we propose Deep Dynamic Policy Programming (DDPP), which combines the sample efficiency and smooth policy updates of dynamic policy programming with the contemporary deep reinforcement learning framework. The effectiveness of the proposed method is first demonstrated in a simulated robot arm control problem, in comparison with Deep Q-Networks. As validation on a real robot system, DDPP also successfully learned to flip a handkerchief with a NEXTAGE humanoid robot using a reduced number of learning samples, whereas Deep Q-Networks failed to learn the task.
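
The "smooth policy updates" mentioned in the abstract can be illustrated with a minimal tabular sketch of the dynamic policy programming (DPP) recursion as it is commonly stated in the DPP literature: action preferences are updated with a Boltzmann-weighted average in place of a hard max, and the policy is a softmax over those preferences. This is not the authors' deep, image-based implementation. The two-state MDP, the function names, and the values of `eta` and `gamma` are all assumptions chosen for illustration.

```python
import numpy as np

def boltzmann_average(psi, eta):
    """Boltzmann-weighted average of preferences over actions (last axis)."""
    z = eta * psi
    w = np.exp(z - z.max(axis=-1, keepdims=True))  # subtract max for stability
    w /= w.sum(axis=-1, keepdims=True)
    return (w * psi).sum(axis=-1)

def dpp_update(psi, P, R, gamma, eta):
    """One tabular DPP sweep (illustrative sketch).

    psi: (S, A) action preferences
    P:   (S, A, S) transition probabilities
    R:   (S, A) expected rewards
    """
    m = boltzmann_average(psi, eta)            # shape (S,)
    # psi(x,a) - M psi(x) + r(x,a) + gamma * E[M psi(x')]
    return psi - m[:, None] + R + gamma * (P @ m)

def dpp_policy(psi, eta):
    """Softmax policy over preferences: pi(a|x) proportional to exp(eta * psi(x,a))."""
    z = eta * psi
    w = np.exp(z - z.max(axis=-1, keepdims=True))
    return w / w.sum(axis=-1, keepdims=True)

# Hypothetical 2-state, 2-action MDP: action 0 stays, action 1 switches state.
# Reward 1 is given whenever the next state is state 1.
P = np.zeros((2, 2, 2))
P[0, 0, 0] = P[0, 1, 1] = 1.0
P[1, 0, 1] = P[1, 1, 0] = 1.0
R = np.array([[0.0, 1.0],
              [1.0, 0.0]])

psi = np.zeros((2, 2))
for _ in range(50):
    psi = dpp_update(psi, P, R, gamma=0.9, eta=1.0)

policy = dpp_policy(psi, eta=1.0)
print(policy)
```

In this toy problem the policy concentrates gradually on switching from state 0 and staying in state 1, rather than jumping to a greedy choice after a single update. A deep variant would replace the table `psi` with a network trained on image inputs, which this sketch does not attempt.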
