Investigating the Sim2real Gap in Computer Vision for Robotics

TMLR Paper1990 Authors

01 Jan 2024 (modified: 17 Sept 2024) | Rejected by TMLR | CC BY 4.0

Abstract: A major challenge in designing machine learning systems for the real world is the sim2real gap, i.e., the change in performance when a system is transferred from simulation to the physical environment. Although many algorithms have been proposed to reduce this gap, the gap itself is not well understood. In this paper, we perform an empirical study of the sim2real gap for popular models on three standard computer vision tasks (monocular depth estimation, object detection, and image inpainting) in a robotic manipulation environment. We find that lighting conditions significantly affect the gap for monocular depth estimation, while object properties affect the gap for object detection and image inpainting. These qualitative observations remain stable across different models and renderers.

Submission Length: Regular submission (no more than 12 pages of main content)

Assigned Action Editor: ~Florian_Shkurti1

Submission Number: 1990
