DIVIDE: Learning a Domain-Invariant Geometric Space for Depth Estimation

Published: 01 Jan 2024, Last Modified: 13 Nov 2024. IEEE Robotics and Automation Letters, 2024. License: CC BY-SA 4.0
Abstract: Learning-based depth estimation requires a large amount of real-world training data, which can be both expensive and time-consuming to acquire. As a result, utilizing fully-annotated synthetic data from virtual environments has emerged as a promising alternative. However, networks trained with synthetic data typically exhibit sub-optimal performance due to the inherent distribution gap between the virtual and real domains. To address this issue, we propose a new domain adaptation framework for depth estimation, DIVIDE, which learns a domain-invariant geometric space to minimize the domain shift. In particular, DIVIDE disentangles the domain-specific components of the input image and removes them while preserving its structural information for accurate depth estimation. Our proposed method outperforms state-of-the-art approaches in domain adaptation for depth estimation and also achieves better style transfer with high image fidelity and structural consistency.
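The abstract's core idea, removing domain-specific components while keeping structure, can be illustrated with a minimal sketch. A common way to factor out domain-specific appearance is instance normalization: per-channel feature statistics act as the removable "style", and the normalized map serves as the structure-preserving "content". This is an illustrative assumption for exposition only, not the authors' actual DIVIDE implementation; all function names and shapes below are hypothetical.

```python
import numpy as np

def disentangle(features, eps=1e-5):
    """Split a (C, H, W) feature map into a domain-invariant 'content'
    part and a domain-specific 'style' part (per-channel mean/std).
    Illustrative only -- not the DIVIDE architecture from the paper."""
    mu = features.mean(axis=(1, 2), keepdims=True)
    sigma = features.std(axis=(1, 2), keepdims=True)
    content = (features - mu) / (sigma + eps)   # style-free structure
    style = (mu, sigma)                         # domain-specific stats
    return content, style

# Toy features from two "domains" with very different channel statistics
rng = np.random.default_rng(0)
real_feat = rng.normal(loc=2.0, scale=3.0, size=(8, 16, 16))
synth_feat = rng.normal(loc=-1.0, scale=0.5, size=(8, 16, 16))

c_real, _ = disentangle(real_feat)
c_synth, _ = disentangle(synth_feat)

# After style removal, both domains share zero-mean, unit-variance features,
# so a depth network sees statistically aligned inputs.
print(np.allclose(c_real.mean(axis=(1, 2)), 0.0, atol=1e-6))
print(np.allclose(c_synth.mean(axis=(1, 2)), 0.0, atol=1e-6))
```

In this simplified view, "minimizing domain shift" amounts to aligning the feature statistics of the two domains while leaving the spatial structure (which carries the depth cues) untouched.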