Abstract: In this work, we present a new dataset for monocular depth estimation created by extracting images, dense depth maps, and odometer data from a realistic video game simulation, Euro Truck Simulator 2\(^\textrm{TM}\). We use the dataset to train state-of-the-art depth estimation models in both supervised and unsupervised settings, and we evaluate the trained models on real-world sequences. Our results demonstrate that models trained exclusively on synthetic data achieve satisfactory performance in the real domain. The quantitative evaluation sheds light on possible causes of the domain gap in monocular depth estimation. Specifically, we discuss the effects of coarse-grained ground-truth depth maps in contrast to fine-grained depth estimates. The dataset and the code for data extraction and experiments are released as open source.