Keywords: Robotics, Reinforcement Learning, Exploration, Real World
Abstract: Reinforcement learning (RL) enables robots to learn skills from interactions with the real world.
In practice, the unstructured step-based exploration used in Deep RL -- often very successful in simulation -- leads to jerky motion patterns on real robots.
The resulting shaky behavior causes poor exploration and can even damage the robot.
We address these issues by adapting state-dependent exploration (SDE) to current Deep RL algorithms.
To enable this adaptation, we propose two extensions to the original SDE, using more general features and re-sampling the noise periodically, which leads to a new exploration method: generalized state-dependent exploration (gSDE).
We evaluate gSDE both in simulation, on PyBullet continuous control tasks, and directly on three different real robots: a tendon-driven elastic robot, a quadruped, and an RC car.
The noise sampling interval of gSDE enables a compromise between performance and smoothness, which allows training directly on the real robots without loss of performance.
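
To make the mechanism concrete, below is a minimal NumPy sketch of the noise scheme the abstract describes: the exploration noise is a deterministic function of a feature vector, and the underlying noise matrix is re-drawn only every `sample_freq` steps instead of at every step. All names, dimensions, and the linear policy are illustrative assumptions, not the paper's implementation (gSDE uses policy-network features where this sketch uses the raw state); see the linked code implementations for the reference version.

```python
import numpy as np

class GSDENoise:
    """Hypothetical sketch of state-dependent exploration with periodic re-sampling."""

    def __init__(self, feature_dim, action_dim, log_std=-1.0, sample_freq=8):
        self.feature_dim = feature_dim
        self.action_dim = action_dim
        self.std = np.exp(log_std)      # scalar noise scale (assumed, for simplicity)
        self.sample_freq = sample_freq  # how many steps one noise matrix is kept
        self.steps = 0
        self.theta = None               # exploration matrix, drawn lazily on first call

    def resample(self):
        # Draw one exploration matrix; it stays fixed for `sample_freq` steps,
        # so between draws the noise varies only through the features.
        self.theta = self.std * np.random.randn(self.action_dim, self.feature_dim)

    def __call__(self, features):
        if self.steps % self.sample_freq == 0:
            self.resample()
        self.steps += 1
        # State-dependent noise: deterministic given (theta, features).
        return self.theta @ features

# Hypothetical usage with a linear deterministic policy mu(s) = W @ s:
np.random.seed(0)
W = np.random.randn(2, 4)                       # illustrative policy weights
noise = GSDENoise(feature_dim=4, action_dim=2)
state = np.random.randn(4)
action = W @ state + noise(state)               # smooth, state-dependent exploration
```

In this sketch, a large `sample_freq` yields smoother trajectories, since the noise changes only through the state, while `sample_freq = 1` recovers per-step re-sampling; this is the performance-vs-smoothness trade-off the abstract mentions.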
Community Implementations: [2 code implementations](https://www.catalyzex.com/paper/arxiv:2005.05719/code)