Keywords: reinforcement learning, autonomous, reset-free, manipulation
TL;DR: A practical and efficient real-world robot system that can self-improve by reinforcement learning.
Abstract: In imitation and reinforcement learning (RL), the cost of human supervision limits the amount of data that the robots can be trained on. While RL offers a framework for building self-improving robots that can learn via trial-and-error autonomously, practical realizations end up requiring extensive human supervision for reward function design and repeated resetting of the environment between episodes of interactions. In this work, we propose MEDAL++, a novel design for self-improving robotic systems: given a small set of expert demonstrations at the start, the robot autonomously practices the task by learning to both do and undo the task, simultaneously inferring the reward function from the demonstrations. The policy and reward function are learned end-to-end from high-dimensional visual inputs, bypassing the need for explicit state estimation or task-specific pre-training for visual encoders used in prior work. We first evaluate our proposed system on a simulated non-episodic benchmark EARL, finding that MEDAL++ is both more data efficient and gets up to 30% better final performance compared to state-of-the-art vision-based methods. Our real-robot experiments show that MEDAL++ can be applied to manipulation problems in larger environments than those considered in prior work, and autonomous self-improvement can improve the success rate by 30% to 70% over behavioral cloning on just the expert data.
Student First Author: yes
Supplementary Material: zip
Instructions: I have read the instructions for authors (https://corl2023.org/instructions-for-authors/)
Website: https://architsharma97.github.io/self-improving-robots/
Code: https://github.com/rehaanahmad2013/self-improving-robots/
Publication Agreement: pdf
Poster Spotlight Video: mp4
Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 2 code implementations](https://www.catalyzex.com/paper/self-improving-robots-end-to-end-autonomous/code)
5 Replies
Loading