Achieving Gentle Manipulation with Deep Reinforcement Learning

Sandy H. Huang, Martina Zambelli, Yuval Tassa, Jackie Kay, Murilo F. Martins, Patrick M. Pilarski, Raia Hadsell

13 Jul 2020 | OpenReview Archive Direct Upload | Readers: Everyone

Abstract: Robots must know how to be gentle when they need to interact with fragile objects, or when the robot itself is prone to wear-and-tear. We propose an approach that enables deep reinforcement learning to train policies that are gentle, both during exploration and task execution. Our approach involves augmenting the (task) reward with a penalty for non-gentleness. However, augmenting with only this penalty impairs learning: policies get stuck in a local optimum of avoiding all contact with the environment. Introducing surprise-based intrinsic rewards solves this problem, as long as the right kind of surprise is chosen—penalty-based surprise is more effective than the typical dynamics-based surprise. Videos are available at http://sites.google.com/view/gentlemanipulation.
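The reward structure described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the force-threshold form of the penalty, the squared-error form of the penalty-based surprise, the function names, and the weights. The abstract states only that the task reward is augmented with a penalty for non-gentleness and with a surprise-based intrinsic reward.

```python
def impact_penalty(contact_force, threshold=1.0):
    """Hypothetical non-gentleness penalty: contact force above a threshold."""
    return max(0.0, contact_force - threshold)


def penalty_surprise(predicted_penalty, actual_penalty):
    """Assumed form of penalty-based surprise: squared error of a penalty predictor."""
    return (predicted_penalty - actual_penalty) ** 2


def shaped_reward(task_reward, contact_force, predicted_penalty,
                  penalty_weight=1.0, surprise_weight=0.5):
    """Task reward, minus the weighted penalty, plus the weighted surprise bonus."""
    penalty = impact_penalty(contact_force)
    bonus = penalty_surprise(predicted_penalty, penalty)
    return task_reward - penalty_weight * penalty + surprise_weight * bonus


# A hard contact the agent did not anticipate: the penalty lowers the reward,
# while the surprise bonus rewards having encountered an unexpected penalty.
print(shaped_reward(task_reward=1.0, contact_force=3.0, predicted_penalty=0.0))
```

In this sketch the bonus shrinks as the penalty predictor improves, so the incentive to seek out contact fades once the agent can anticipate how forceful its contacts will be. Without the bonus term, a policy that never touches anything incurs no penalty at all, which matches the local optimum the abstract describes.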
