LSHNet: Leveraging Structure-Prior With Hierarchical Features Updates for Salient Object Detection in Optical Remote Sensing Images
Abstract: Salient object detection in optical remote sensing images (ORSI-SOD) aims to detect the most prominent objects in optical remote sensing images (ORSIs). The task is challenging because targets vary widely in size and shape, are irregularly distributed in the scene, and are often occluded by their surroundings. Many deep learning-based models have recently demonstrated promising performance in ORSI-SOD, yet considerable room remains for addressing these challenges. In this article, we propose LSHNet, a novel dual-branch architecture consisting of an edge encoder, which extracts structure features using edges as the structure-prior, and an image encoder, which extracts context features from the image. We further propose three modules. The image-structure fusion module (ISFM) integrates the two-stream features extracted by the dual-branch encoders through intrapatch and interpatch attention mechanisms, exploiting diverse receptive fields. The local-global feature fusion module (LGFM) transfers global features representing the target to local feature maps to discriminate target regions from background clutter. The semantic cues updating module (SCUM) updates the representative features of the target from high level to low level; by integrating hierarchical information effectively, it rectifies the global features extracted from multiple layers. We evaluate LSHNet on three main ORSI-SOD datasets: ORSSD, EORSSD, and ORSI-4199. We report promising results on all three datasets and analyze the effectiveness of the proposed modules in an ablation study.
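To make the idea of an edge-based structure-prior concrete, the sketch below computes a normalized Sobel gradient-magnitude map from a grayscale image. The abstract does not state how the edge input is produced, so the Sobel operator and the helper names `sobel_edge_map` and `normalize` are assumptions chosen for illustration, not the authors' method.

```python
def sobel_edge_map(image):
    """Sobel gradient magnitude of a 2-D grayscale image given as a list of rows.

    Border pixels are left at 0.0 for simplicity (illustrative choice).
    """
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal-gradient kernel
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical-gradient kernel
    h, w = len(image), len(image[0])
    edges = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    p = image[y + dy - 1][x + dx - 1]
                    gx += kx[dy][dx] * p
                    gy += ky[dy][dx] * p
            edges[y][x] = (gx * gx + gy * gy) ** 0.5
    return edges


def normalize(edge_map):
    """Scale an edge map to [0, 1]; an all-zero map is returned unchanged."""
    peak = max(max(row) for row in edge_map)
    if peak == 0:
        return edge_map
    return [[v / peak for v in row] for row in edge_map]


# Toy 5x5 image: dark left part, bright right part, i.e. one vertical edge.
toy = [[0, 0, 1, 1, 1] for _ in range(5)]
prior = normalize(sobel_edge_map(toy))
```

In this toy case the response concentrates on the columns next to the intensity step and vanishes in the flat region, which is the kind of boundary cue an edge encoder could consume alongside the image encoder's context features.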