PIDDNet: RGB-Depth Fusion Network for Real-time Semantic Segmentation

Published: 01 Jan 2023, Last Modified: 13 Nov 2024 · ICTC 2023 · CC BY-SA 4.0
Abstract: For RGB semantic segmentation, two-branch networks were proposed to effectively exploit both the local detail information and the global contextual information in an RGB image. This architecture combines a shallow spatial path with a deeper context path, achieving both high accuracy and high FPS. Research on RGB-Depth segmentation has shown that a depth map can provide complementary information to an RGB model, yielding performance gains. However, the benefit of fusing RGB and depth within a two-branch framework remains unclear, because the two modalities have distinct characteristics. To address this, we present a novel RGB-Depth fusion architecture that accounts for the attributes of local context, global context, RGB, and the depth map. Through a bidirectional image-depth fusion technique, we effectively leverage each modality, achieving 81.23% mIoU. This marks a gain of 1.27% over the RGB-only model and 0.45% over the element-wise feature-addition fusion baseline.
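The abstract does not specify the fusion module's internals, so the following is a minimal PyTorch sketch of the general idea only: a shallow spatial path and a deeper context path on the RGB image, a separate depth encoder, and a bidirectional gating fusion in which each modality's features modulate the other's. All names (`BidirectionalFusion`, `TwoBranchRGBD`), the sigmoid-gating mechanism, channel widths, and the output stride of 8 are assumptions for illustration, not the paper's actual architecture.

```python
import torch
import torch.nn as nn


def conv_bn_relu(in_ch, out_ch, stride=1):
    """3x3 conv + BN + ReLU building block (stand-in for a real backbone stage)."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class BidirectionalFusion(nn.Module):
    """Hypothetical bidirectional fusion: each modality gates the other
    through a 1x1-conv sigmoid attention, so information flows both
    depth -> RGB and RGB -> depth."""

    def __init__(self, ch):
        super().__init__()
        self.gate_rgb = nn.Sequential(nn.Conv2d(ch, ch, 1), nn.Sigmoid())
        self.gate_depth = nn.Sequential(nn.Conv2d(ch, ch, 1), nn.Sigmoid())

    def forward(self, rgb, depth):
        rgb_out = rgb + rgb * self.gate_rgb(depth)        # depth refines RGB
        depth_out = depth + depth * self.gate_depth(rgb)  # RGB refines depth
        return rgb_out, depth_out


class TwoBranchRGBD(nn.Module):
    """Two-branch RGB network (shallow spatial path for local detail,
    deeper context path for global context) plus a depth encoder; depth
    features are fused bidirectionally with the RGB context features."""

    def __init__(self, num_classes=19):
        super().__init__()
        # Shallow spatial path: a few strided convs keep detail at stride 8.
        self.spatial = nn.Sequential(
            conv_bn_relu(3, 64, 2), conv_bn_relu(64, 64, 2),
            conv_bn_relu(64, 128, 2))
        # Deeper context path on RGB (placeholder for a real backbone).
        self.context = nn.Sequential(
            conv_bn_relu(3, 64, 2), conv_bn_relu(64, 128, 2),
            conv_bn_relu(128, 128, 2), conv_bn_relu(128, 128, 1))
        # Depth-map encoder, matched to stride 8 / 128 channels.
        self.depth_enc = nn.Sequential(
            conv_bn_relu(1, 64, 2), conv_bn_relu(64, 128, 2),
            conv_bn_relu(128, 128, 2))
        self.fuse = BidirectionalFusion(128)
        self.head = nn.Conv2d(128 * 3, num_classes, 1)

    def forward(self, rgb, depth):
        f_sp = self.spatial(rgb)
        f_cx = self.context(rgb)
        f_d = self.depth_enc(depth)
        f_cx, f_d = self.fuse(f_cx, f_d)
        logits = self.head(torch.cat([f_sp, f_cx, f_d], dim=1))
        # Upsample back to input resolution.
        return nn.functional.interpolate(
            logits, scale_factor=8, mode="bilinear", align_corners=False)


# Usage sketch: a 512x1024 RGB image with its depth map yields
# per-pixel class logits of shape (1, 19, 512, 1024).
model = TwoBranchRGBD()
out = model(torch.randn(1, 3, 512, 1024), torch.randn(1, 1, 512, 1024))
```

The additive gating (`x + x * gate(y)`) is one common way to let one modality refine the other without discarding its original features; the actual PIDDNet fusion may differ.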