Pre-trained Image Encoder for Data-Efficient Reinforcement Learning and Sim-to-Real transfer on Robotic-Manipulation tasks
Keywords: Reinforcement Learning, Computer Vision, Sim-to-Real
TL;DR: A simple pipeline for vision-based robotic-manipulation reinforcement learning: an encoder is pre-trained on multiple computer-vision objectives, then frozen and used by an RL agent in place of the raw images. This also allows simple sim-to-real transfer.
Abstract: Sample efficiency remains a major challenge for reinforcement-learning (RL) algorithms, particularly when learning directly from image inputs. We propose a simple two-step pipeline: first, learn a visual representation of the scene by pre-training an encoder on multiple supervised computer-vision objectives; then, train an RL agent on top of this representation, so that it can focus solely on solving the task.
We evaluate our method on three realistic manipulation tasks with a simulated 6-degree-of-freedom robot. We show that our method is not only much more sample-efficient than an end-to-end baseline, but also reaches a higher final success rate, even solving one of the tasks on which the baseline fails to make any progress.
Additionally, by incorporating domain-randomization techniques into our pipeline, we are able to consistently solve a simpler reaching task in the real world via zero-shot sim-to-real transfer.
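
The two-step structure described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the class names (`FrozenEncoder`, `LinearPolicy`), the `randomize` helper, the flat-list image format, and the placeholder update rule are all assumptions chosen for illustration. The sketch only shows the division of labour: the encoder's weights are fixed after pre-training, images are perturbed before encoding as a toy form of domain randomization, and only the policy's parameters change during RL.

```python
import random


class FrozenEncoder:
    """Hypothetical stand-in for a pre-trained image encoder.

    The weights are set once (here randomly, in place of real pre-training)
    and are never modified afterwards.
    """

    def __init__(self, n_pixels, n_features, seed=0):
        rng = random.Random(seed)
        self.weights = [
            [rng.uniform(-1.0, 1.0) for _ in range(n_pixels)]
            for _ in range(n_features)
        ]

    def encode(self, image):
        # Linear projection of a flattened image onto a small feature vector.
        return [sum(w * p for w, p in zip(row, image)) for row in self.weights]


class LinearPolicy:
    """Hypothetical policy that sees only encoder features, never raw pixels."""

    def __init__(self, n_features, n_actions):
        self.weights = [[0.0] * n_features for _ in range(n_actions)]

    def act(self, features):
        scores = [sum(w * f for w, f in zip(row, features)) for row in self.weights]
        return scores.index(max(scores))

    def update(self, features, action, advantage, lr=0.1):
        # Placeholder update (assumption): push the chosen action's score up
        # or down in proportion to the advantage. Not the paper's RL algorithm.
        self.weights[action] = [
            w + lr * advantage * f for w, f in zip(self.weights[action], features)
        ]


def randomize(image, rng):
    """Toy domain randomization: random global brightness scale and offset."""
    scale = rng.uniform(0.5, 1.5)
    offset = rng.uniform(-0.1, 0.1)
    return [min(1.0, max(0.0, scale * p + offset)) for p in image]


# Usage: one interaction step with a 4-pixel "image" and 2 discrete actions.
rng = random.Random(1)
encoder = FrozenEncoder(n_pixels=4, n_features=3)
policy = LinearPolicy(n_features=3, n_actions=2)

features = encoder.encode(randomize([0.0, 0.5, 1.0, 0.25], rng))
action = policy.act(features)
policy.update(features, action, advantage=1.0)  # only the policy changes
```

Because the agent learns on a low-dimensional feature vector instead of pixels, the RL step has far fewer parameters to fit, which is the intuition behind the sample-efficiency claim.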