Abstract: The regularity of everyday tasks enables us to
reuse existing solutions for task variations. For instance, most
door handles require the same basic skill (reach, grasp, turn,
pull), but the basic skill must be adjusted slightly to cope
with the variations that exist (e.g. levers vs. knobs).
We introduce the algorithm “Simultaneous On-line Discovery
and Improvement of Robotic Skills” (SODIRS), which
autonomously discovers and optimizes skill options for such
task variations. We formalize the problem in a reinforcement
learning context, and use the PI^BB algorithm [2] to continually
optimize skills with respect to a cost function. SODIRS discovers
new subskills, or “skill options”, by clustering the costs of trials,
and determining whether perceptual features are able to predict
which cluster a trial will belong to. This enables SODIRS to
build a decision tree, in which the leaves contain skill options
for task variations. We demonstrate SODIRS’ performance in
simulation, as well as on a Meka humanoid robot performing
the ball-in-cup task.
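
The discovery step described above (cluster the costs of trials, then test whether a perceptual feature predicts the cluster) can be illustrated with a minimal sketch. This is not the authors' implementation: the 1-D two-means clustering, the single scalar feature, the threshold search, and the names `two_means_1d`, `best_threshold`, `should_split` and `min_accuracy` are all assumptions made for illustration.

```python
from statistics import mean


def two_means_1d(costs, iters=20):
    """Cluster scalar trial costs into two groups (1-D k-means, k=2).

    Assumption: a stand-in for whatever clustering the method actually uses.
    """
    lo, hi = min(costs), max(costs)
    labels = [0] * len(costs)
    for _ in range(iters):
        labels = [0 if abs(c - lo) <= abs(c - hi) else 1 for c in costs]
        group0 = [c for c, l in zip(costs, labels) if l == 0]
        group1 = [c for c, l in zip(costs, labels) if l == 1]
        if not group0 or not group1:
            break  # all costs fell into one cluster
        lo, hi = mean(group0), mean(group1)
    return labels


def best_threshold(features, labels):
    """Return (accuracy, threshold) of the best single-feature split."""
    values = sorted(set(features))
    best_acc, best_t = 0.0, None
    for a, b in zip(values, values[1:]):
        t = (a + b) / 2
        hits = [1.0 if (f >= t) == (l == 1) else 0.0
                for f, l in zip(features, labels)]
        acc = mean(hits)
        acc = max(acc, 1.0 - acc)  # cluster ids are arbitrary
        if acc > best_acc:
            best_acc, best_t = acc, t
    return best_acc, best_t


def should_split(features, costs, min_accuracy=0.9):
    """Return a feature threshold for a new decision-tree node, or None.

    A split is proposed only if the costs form two clusters AND the
    feature predicts cluster membership well enough (hypothetical rule).
    """
    labels = two_means_1d(costs)
    if len(set(labels)) < 2:
        return None
    acc, t = best_threshold(features, labels)
    return t if acc >= min_accuracy else None


# Hypothetical trials: feature could be e.g. a measured handle width.
features = [0.10, 0.12, 0.11, 0.50, 0.52, 0.49]
costs = [1.0, 1.1, 0.9, 5.0, 5.2, 4.8]
threshold = should_split(features, costs)
```

In this sketch, a returned threshold would correspond to adding a node to the decision tree, with each resulting leaf holding its own skill option to be optimized separately; `None` means the current skill is kept as a single option.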