Surface Normals: Always-On Perception for Vision-Based Robots

Published: 24 Apr 2024, Last Modified: 24 Apr 2024
Venue: ICRA 2024 Workshop on 3D Visual Representations for Robot Manipulation
License: CC BY 4.0
Keywords: surface normal estimation, 3D reconstruction
TL;DR: We push the limits of single-image surface normal estimation by rethinking the inductive biases needed for the task.
Abstract: In recent years, the usefulness of surface normal estimation has been demonstrated in various areas of robotics and computer vision. State-of-the-art methods generalize well and are highly efficient, running in real time even on laptop computers. This makes them a strong candidate for "always-on" perception in vision-based robots: with the extracted surface normal cues as a foundation, task- (and domain-)specific functionalities can be built and invoked "on demand". In this paper, we push the limits of single-image surface normal estimation by rethinking the inductive biases needed for the task. Specifically, we propose to (1) utilize the per-pixel ray direction and (2) encode the relationship between neighboring surface normals by learning their relative rotation. The proposed method produces crisp yet piecewise-smooth predictions for challenging in-the-wild images of arbitrary resolution and aspect ratio. Compared to a recent ViT-based state-of-the-art model, our method shows stronger generalization despite being trained on a dataset that is orders of magnitude smaller. The code is available at https://github.com/baegwangbin/DSINE. At the workshop, we will give a real-time demo of the proposed method on a laptop computer.
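To make the two inductive biases from the abstract concrete, here is a minimal, speculative sketch (not the code from the linked repository): per-pixel ray directions can be derived from pinhole camera intrinsics, and a relative rotation, here applied via Rodrigues' formula, can relate a normal to its neighbor's. All intrinsics values, function names, and the example rotation below are illustrative assumptions.

```python
# Illustrative sketch of the two inductive biases; not the DSINE implementation.
import numpy as np

def pixel_ray_directions(H, W, fx, fy, cx, cy):
    """Unit-norm viewing ray for every pixel of an HxW image,
    derived from assumed pinhole intrinsics (fx, fy, cx, cy)."""
    u, v = np.meshgrid(np.arange(W), np.arange(H))  # pixel coordinates, shape (H, W)
    rays = np.stack([(u - cx) / fx, (v - cy) / fy, np.ones((H, W))], axis=-1)
    return rays / np.linalg.norm(rays, axis=-1, keepdims=True)  # (H, W, 3)

def rotate_normal(n, axis, angle):
    """Rotate a unit normal `n` about `axis` by `angle` (Rodrigues' formula).
    A learned *relative* rotation of this form can encode how a normal
    relates to its neighbor's, instead of predicting each normal in isolation."""
    axis = axis / np.linalg.norm(axis)
    return (n * np.cos(angle)
            + np.cross(axis, n) * np.sin(angle)
            + axis * np.dot(axis, n) * (1.0 - np.cos(angle)))

# Example: ray map for a 480x640 image with assumed intrinsics.
rays = pixel_ray_directions(480, 640, fx=500.0, fy=500.0, cx=320.0, cy=240.0)

# Example: obtain a pixel's normal by rotating its neighbor's normal
# by a small (here hand-picked, in practice learned) relative rotation.
n_neighbor = np.array([0.0, 0.0, -1.0])
n_center = rotate_normal(n_neighbor, axis=np.array([0.0, 1.0, 0.0]), angle=0.05)
```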
Submission Number: 16