Multi-Modal Indoor Dataset for Event-based Monocular Depth Estimation by Mobile Robots

Published: 21 Sept 2025, Last Modified: 14 Oct 2025
Venue: NeuRobots 2025 (Spotlight)
License: CC BY 4.0
Keywords: Multi-Modal Dataset, Event-based Cameras, Monocular Depth Estimation, Sensor Fusion, Robot Vision
Abstract: This article introduces a multi-modal indoor dataset for event-based monocular depth estimation by mobile robots. The dataset was recorded on a humanoid platform and includes synchronized RGB, depth, event streams, and IMU data from Intel RealSense D435i, DAVIS346, and Prophesee EVK4 sensors. To provide a baseline, we implement a CycleGAN model that learns bidirectional mappings between the event-representation domain and the depth domain. We evaluate multiple state-of-the-art event representations, showing that event-based inputs can outperform frame-only inputs in accuracy, perceptual quality, and geometric reliability. Together, the dataset and baseline provide a reproducible testbed for event-based perception in indoor mobile robotics.
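The abstract's baseline learns unpaired, bidirectional mappings between an event representation and depth maps via cycle consistency. The sketch below is a minimal, hypothetical illustration of that objective, not the authors' implementation: `TinyGenerator`, the 2-channel polarity-histogram input, and all tensor shapes are assumptions for illustration only (the actual baseline presumably uses full CycleGAN generators, discriminators, and adversarial losses).

```python
# Hypothetical sketch of the cycle-consistency term in a CycleGAN-style
# event-to-depth baseline. All names, channel counts, and shapes are assumed.
import torch
import torch.nn as nn


def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.InstanceNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class TinyGenerator(nn.Module):
    """Toy stand-in for a CycleGAN generator (a real one would use ResNet blocks)."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.net = nn.Sequential(
            conv_block(in_ch, 32),
            conv_block(32, 32),
            nn.Conv2d(32, out_ch, kernel_size=3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


# G_ed: event representation -> depth; G_de: depth -> event representation.
G_ed = TinyGenerator(in_ch=2, out_ch=1)
G_de = TinyGenerator(in_ch=1, out_ch=2)

# Dummy batch: a 2-channel event representation (e.g. per-polarity event counts)
# and a single-channel depth map, both 64x64 for illustration.
events = torch.randn(4, 2, 64, 64)
depth = torch.randn(4, 1, 64, 64)

l1 = nn.L1Loss()
# Cycle consistency: translating to the other domain and back should
# reconstruct the original input in both directions.
cycle_loss = l1(G_de(G_ed(events)), events) + l1(G_ed(G_de(depth)), depth)
cycle_loss.backward()
```

In a full training loop this term would be weighted and combined with adversarial losses from per-domain discriminators, as in the standard CycleGAN formulation.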
Submission Number: 7