LiteGfm: A Lightweight Self-supervised Monocular Depth Estimation Framework for Artifacts Reduction via Guided Image Filtering

Zhilin He, Yawei Zhang, Jingchang Mu, Xiaoyue Gu, Tianhao Gu

Published: 28 Oct 2024, Last Modified: 09 Nov 2025. License: CC BY-SA 4.0
Abstract: Lightweight monocular depth estimation faces two significant challenges: preserving detail information and reducing artifacts in the predicted depth maps. To address them, this paper proposes a self-supervised monocular depth estimation framework, called LiteGfm, which comprises a DepthNet with an Anti-Artifact Guided (AAG) module and a PoseNet. In the AAG module, a guided image filtering with cross-detail masking is first designed to filter the input features of the decoder, preserving comprehensive detail information. Second, a filter kernel generator is proposed that decomposes the Sobel operator along the vertical and horizontal axes to achieve cross-detail masking, which better captures structure and edge features and minimizes artifacts. Furthermore, a boundary-aware loss between the reconstructed and input images is presented to preserve high-frequency details and further reduce artifacts. Extensive experimental results demonstrate that LiteGfm, with under 1.9M parameters, outperforms state-of-the-art methods.
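The abstract builds on two standard ingredients: the classic guided image filter (an edge-preserving smoother driven by a guidance image) and the separability of the Sobel operator into vertical and horizontal one-dimensional kernels. The sketch below illustrates both ingredients in isolation; it is a minimal reference implementation of the generic guided filter and Sobel decomposition, not the paper's AAG module or its filter kernel generator, whose exact design is not given in the abstract.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter(I, p, radius=4, eps=1e-3):
    """Classic guided image filter: edge-preserving smoothing of p,
    steered by guidance image I. Both are 2-D float arrays."""
    size = 2 * radius + 1
    mean_I = uniform_filter(I, size)
    mean_p = uniform_filter(p, size)
    corr_Ip = uniform_filter(I * p, size)
    corr_II = uniform_filter(I * I, size)
    var_I = corr_II - mean_I * mean_I      # local variance of the guidance
    cov_Ip = corr_Ip - mean_I * mean_p     # local covariance of I and p
    a = cov_Ip / (var_I + eps)             # per-pixel linear coefficients
    b = mean_p - a * mean_I
    mean_a = uniform_filter(a, size)
    mean_b = uniform_filter(b, size)
    return mean_a * I + mean_b             # locally linear model of I

# The 3x3 horizontal Sobel kernel factors into a vertical smoothing
# vector [1, 2, 1] and a horizontal derivative vector [-1, 0, 1]:
sobel_x = np.outer([1, 2, 1], [-1, 0, 1])
# sobel_x == [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
```

Filtering a constant region with the guided filter returns the same constant, while edges present in the guidance image are preserved; this is the property that makes guided filtering attractive for suppressing depth-map artifacts without blurring object boundaries.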