GNFactor: Multi-Task Real Robot Learning with Generalizable Neural Feature Fields

Published: 30 Aug 2023, Last Modified: 03 Jul 2024, CoRL 2023 Oral, Readers: Everyone
Keywords: Robotic Manipulation, Neural Radiance Field, Behavior Cloning
Abstract: It is a long-standing problem in robotics to develop agents capable of executing diverse manipulation tasks from visual observations in unstructured real-world environments. To achieve this goal, the robot will need to have a comprehensive understanding of the 3D structure and semantics of the scene. In this work, we present $\textbf{GNFactor}$, a visual behavior cloning agent for multi-task robotic manipulation with $\textbf{G}$eneralizable $\textbf{N}$eural feature $\textbf{F}$ields. GNFactor jointly optimizes a neural radiance field (NeRF) as a reconstruction module and a Perceiver Transformer as a decision-making module, leveraging a shared deep 3D voxel representation. To incorporate semantics in 3D, the reconstruction module incorporates a vision-language foundation model (e.g., Stable Diffusion) to distill rich semantic information into the deep 3D voxel. We evaluate GNFactor on 3 real-robot tasks and perform detailed ablations on 10 RLBench tasks with a limited number of demonstrations. We observe a substantial improvement of GNFactor over current state-of-the-art methods in seen and unseen tasks, demonstrating the strong generalization ability of GNFactor. Project website: https://yanjieze.com/GNFactor/
Student First Author: yes
Supplementary Material: zip
Instructions: I have read the instructions for authors (https://corl2023.org/instructions-for-authors/)
Video: https://yanjieze.com/GNFactor
Website: https://yanjieze.com/GNFactor
Code: https://github.com/YanjieZe/GNFactor
Publication Agreement: pdf
Community Implementations: [2 code implementations](https://www.catalyzex.com/paper/gnfactor-multi-task-real-robot-learning-with/code)