Adaptive Monocular Depth Estimation with Masked Image Consistency

Damian Sójka; Marc Masana; Bartłomiej Twardowski; Sebastian Cygert

Published: 10 Jun 2025, Last Modified: 11 Jul 2025, PUT at ICML 2025 Poster, CC BY 4.0

Keywords: test-time adaptation, monocular depth estimation

TL;DR: Leveraging masked image consistency with scale alignment allows for more effective continual test-time adaptation for monocular depth estimation.

Abstract: Current Continual Test-Time Adaptation methods for Monocular Depth Estimation rely on extra data and are inefficient: they use auxiliary source models or adjacent video frames, which increases computational demand. To address these limitations, we propose to use masked image modeling, extending Masked Image Consistency. Combined with scale alignment to account for varying camera setups, our approach enforces consistency between predictions on masked and unmasked images. Empirical results in autonomous driving scenarios show its effectiveness, with performance comparable to the state of the art.
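
To make the idea in the abstract concrete, here is a minimal, framework-free sketch of a consistency objective between a prediction on a masked image and a prediction on the unmasked image. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the masking scheme, the scale alignment rule, or the loss, so the patch-wise masking, the median-ratio alignment, and the mean absolute error used here are all assumptions, and the function names `mask_patches`, `align_scale`, and `consistency_loss` are hypothetical. Depth maps are plain lists of rows with strictly positive values; in practice they would come from a depth network.

```python
import random
from statistics import median


def mask_patches(image, patch=2, ratio=0.5, rng=None):
    """Return a copy of a 2D image with random patch x patch blocks set to 0.

    Assumption: block-wise random masking; the paper's exact scheme may differ.
    """
    rng = rng or random.Random(0)
    h, w = len(image), len(image[0])
    masked = [row[:] for row in image]
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            if rng.random() < ratio:
                for y in range(top, min(top + patch, h)):
                    for x in range(left, min(left + patch, w)):
                        masked[y][x] = 0.0
    return masked


def align_scale(pred, reference):
    """Rescale pred so that its median matches the median of reference.

    Assumption: median-ratio alignment with strictly positive depths.
    """
    flat_pred = [v for row in pred for v in row]
    flat_ref = [v for row in reference for v in row]
    scale = median(flat_ref) / median(flat_pred)
    return [[v * scale for v in row] for row in pred]


def consistency_loss(depth_masked, depth_full):
    """Mean absolute difference after aligning the masked-image prediction
    to the unmasked-image prediction (assumed loss form)."""
    aligned = align_scale(depth_masked, depth_full)
    diffs = [
        abs(a - b)
        for row_a, row_b in zip(aligned, depth_full)
        for a, b in zip(row_a, row_b)
    ]
    return sum(diffs) / len(diffs)
```

In an adaptation loop, one would run the model on `mask_patches(image)` and on `image`, then minimize `consistency_loss` between the two predictions. Because of the alignment step, a masked-image prediction that differs from the unmasked one only by a global scale factor yields zero loss in this sketch.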

Supplementary Material: zip

Submission Number: 43
