R-MSFM: Recurrent Multi-Scale Feature Modulation for Monocular Depth Estimating

Zhongkai Zhou, Xinnan Fan, Pengfei Shi, Yuanxue Xin

Published: 2021, Last Modified: 04 Nov 2025. Venue: ICCV 2021. License: CC BY-SA 4.0

Abstract: In this paper, we propose Recurrent Multi-Scale Feature Modulation (R-MSFM), a new deep network architecture for self-supervised monocular depth estimation. R-MSFM extracts per-pixel features, builds a multi-scale feature modulation module, and iteratively updates an inverse depth map through a parameter-shared decoder at a fixed resolution. This architecture enables R-MSFM to maintain representations that are semantically richer yet spatially more precise, and to avoid the error propagation caused by the U-Net-like coarse-to-fine architecture widely used in this domain, resulting in strong generalization and an efficient use of parameters. Experimental results demonstrate the advantages of R-MSFM in both model size and inference speed, and show state-of-the-art results on the KITTI benchmark. Code is available at https://github.com/jsczzzk/R-MSFM
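The control flow described in the abstract (one set of decoder parameters reused across iterations, with the inverse depth kept at a single working resolution) can be illustrated with a toy sketch. This is not the authors' implementation: the names `upsample_nearest`, `shared_update`, and `iterative_inverse_depth` are hypothetical, the update rule is a simple stand-in for a learned decoder, and the choice of one update per pyramid level is an assumption made only for illustration.

```python
from typing import List

Grid = List[List[float]]  # a tiny H x W map stored as nested lists


def upsample_nearest(feat: Grid, height: int, width: int) -> Grid:
    """Resize a map to the fixed working resolution (nearest neighbour)."""
    src_h, src_w = len(feat), len(feat[0])
    return [
        [feat[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def shared_update(inv_depth: Grid, modulation: Grid, step: float) -> Grid:
    """Toy stand-in for a parameter-shared decoder (assumption, not learned).

    The same rule, with the same `step`, is applied at every iteration:
    each pixel moves a fraction `step` toward the modulation signal.
    """
    return [
        [d + step * (m - d) for d, m in zip(d_row, m_row)]
        for d_row, m_row in zip(inv_depth, modulation)
    ]


def iterative_inverse_depth(
    pyramid: List[Grid], height: int, width: int, step: float = 0.5
) -> List[Grid]:
    """Refine one inverse depth map at a fixed resolution, once per scale."""
    inv_depth = [[0.0] * width for _ in range(height)]  # initial estimate
    predictions = []
    for feat in pyramid:
        # Bring this scale's features to the working resolution, then
        # apply the shared update; the resolution of inv_depth never changes.
        modulation = upsample_nearest(feat, height, width)
        inv_depth = shared_update(inv_depth, modulation, step)
        predictions.append(inv_depth)
    return predictions
```

Calling `iterative_inverse_depth(pyramid, 2, 2)` with a two-level pyramid returns two 2 x 2 maps, one per iteration. The point of the sketch is only that every refinement happens on the same grid with the same update function, in contrast to a coarse-to-fine decoder that produces a new prediction at each resolution.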

External IDs: dblp:conf/iccv/ZhouFSX21