Keywords: precise manipulation, robot learning, few-shot learning
Abstract: In this work, we study how to build a robotic system
that can solve multiple 3D manipulation tasks given language
instructions. To be useful in industrial and household domains,
such a system should be capable of learning new tasks with
few demonstrations and solving them precisely. Prior works, like
PerAct and RVT, have studied this problem; however,
they often struggle with tasks requiring high precision. We study
how to make them more effective, precise, and fast. Using a
combination of architectural and system-level improvements, we
propose RVT-2, a multitask 3D manipulation model that is 6X
faster in training and 2X faster in inference than its predecessor
RVT. RVT-2 achieves a new state-of-the-art on RLBench,
improving the success rate from 65% to 82%. RVT-2 is also
effective in the real world, where it can learn tasks requiring
high precision, like picking up and inserting plugs, with just
10 demonstrations.
Submission Number: 27