SPOC: Imitating Shortest Paths in Simulation Enables Effective Navigation and Manipulation in the Real World

Published: 01 Jan 2024 · Last Modified: 18 May 2025 · CVPR 2024 · CC BY-SA 4.0
Abstract: Reinforcement learning (RL) with dense rewards and imitation learning (IL) with human-generated trajectories are the most widely used approaches for training modern embodied agents. RL requires extensive reward shaping and auxiliary losses and is often too slow and ineffective for long-horizon tasks. While IL with human supervision is effective, collecting human trajectories at scale is extremely expensive. In this work, we show that imitating shortest-path planners in simulation produces agents that, given a language instruction, can proficiently navigate, explore, and manipulate objects in both simulation and in the real world using only RGB sensors (no depth maps or GPS coordinates). This surprising result is enabled by our end-to-end, transformer-based SPOC architecture, powerful visual encoders paired with extensive image augmentation, and the dramatic scale and diversity of our training data: millions of frames of shortest-path-expert trajectories collected inside approximately 200,000 procedurally generated houses containing 40,000 unique 3D assets. Our models, data, training code, and newly proposed 10-task benchmarking suite CHORES are available at spoc-robot.github.io.
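The authors' actual training code lives at spoc-robot.github.io; this page includes no snippets. As a rough, non-authoritative illustration of the core idea, the sketch below shows behavior cloning against shortest-path expert actions using a small causal transformer over RGB frames. All module names, shapes, and hyperparameters are illustrative, and the language-instruction conditioning, pretrained visual backbone, and image augmentation described in the abstract are omitted for brevity.

```python
import torch
import torch.nn as nn

class BCPolicy(nn.Module):
    """Toy transformer policy: encodes each RGB frame, attends causally
    over the observation history, and predicts a discrete action per step.
    A stand-in for SPOC's architecture, not a reproduction of it."""
    def __init__(self, num_actions: int, d_model: int = 256):
        super().__init__()
        # Minimal visual encoder; the paper uses a powerful pretrained
        # encoder with heavy image augmentation, omitted here.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, d_model),
        )
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, num_actions)

    def forward(self, frames):  # frames: (B, T, 3, H, W)
        b, t = frames.shape[:2]
        feats = self.encoder(frames.flatten(0, 1)).view(b, t, -1)
        # Causal mask: each step attends only to past observations.
        mask = nn.Transformer.generate_square_subsequent_mask(t)
        return self.head(self.temporal(feats, mask=mask))

# One behavior-cloning step on a (synthetic) batch of expert trajectories:
# the labels stand in for actions produced by a shortest-path planner.
policy = BCPolicy(num_actions=20)
opt = torch.optim.AdamW(policy.parameters(), lr=3e-4)
frames = torch.randn(4, 16, 3, 224, 224)          # RGB observations
expert_actions = torch.randint(0, 20, (4, 16))    # shortest-path labels
logits = policy(frames)
loss = nn.functional.cross_entropy(logits.flatten(0, 1),
                                   expert_actions.flatten())
opt.zero_grad(); loss.backward(); opt.step()
```

The key point the abstract makes is that the supervision signal comes from a privileged shortest-path planner in simulation rather than from dense rewards or human teleoperation, so the loss above is plain supervised cross-entropy over expert actions.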