Salient Object Detection With Edge-Guided Learning and Specific Aggregation

Liqian Zhang, Qing Zhang

Published: 2024, Last Modified: 17 Apr 2025. IEEE Trans. Circuits Syst. Video Technol. 2024. CC BY-SA 4.0

Abstract: Recently, the performance of salient object detection (SOD) has been significantly improved by utilizing edge information for auxiliary training. However, the extraction and utilization of edge cues and the fusion of multi-level features remain two open issues in existing edge-aware models. In this paper, we devise a novel SOD network with edge-guided learning and specific aggregation, named ELSA-Net, to jointly address these two issues. First, we propose an edge-guided learning strategy, which utilizes edge cues as low-level guidance to improve saliency prediction. Specifically, we design a two-stream model that uses a saliency branch and an edge branch to detect the interior and the boundary of salient objects, respectively. An edge-guided interaction module (EGI) is then designed to enhance features by embedding edge information into the saliency branch as spatial weights. In addition, two specific aggregation modules are proposed for the progressive fusion of multi-level features in the two streams, thus making full use of semantic and detailed information. The high-level interactive fusion module (HIF) leverages the correlation between two deeper features to obtain more powerful global contexts. The low-level weighted fusion module (LWF) focuses on complementing fine details by selectively integrating input features. Extensive experiments show that the proposed approach outperforms 19 state-of-the-art methods on five datasets, which validates its effectiveness both quantitatively and qualitatively.
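To make the idea of "embedding edge information into the saliency branch as spatial weights" more concrete, the following is a minimal sketch in plain Python. It is not the authors' EGI module: the abstract does not give its internals, so the channel-mean reduction, the sigmoid gating, the residual term, and the function names `edge_spatial_weights` and `edge_guided_enhance` are all assumptions made for illustration. Feature maps are represented as nested lists of shape [C][H][W].

```python
import math


def sigmoid(x):
    """Logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def edge_spatial_weights(edge_feat):
    """Collapse a [C][H][W] edge feature into one [H][W] weight map.

    Assumption: channels are averaged and squashed with a sigmoid,
    so each spatial position receives a weight in (0, 1).
    """
    channels = len(edge_feat)
    height, width = len(edge_feat[0]), len(edge_feat[0][0])
    return [
        [
            sigmoid(sum(edge_feat[c][i][j] for c in range(channels)) / channels)
            for j in range(width)
        ]
        for i in range(height)
    ]


def edge_guided_enhance(sal_feat, edge_feat):
    """Reweight every saliency channel by the edge-derived weight map.

    Assumption: a residual form, out = sal * (1 + w), so positions with
    weak edge responses keep their original saliency features.
    """
    w = edge_spatial_weights(edge_feat)
    return [
        [
            [value * (1.0 + w[i][j]) for j, value in enumerate(row)]
            for i, row in enumerate(channel)
        ]
        for channel in sal_feat
    ]


# Hypothetical toy input: one saliency channel and one edge channel, 2x2 each.
sal = [[[1.0, 1.0], [1.0, 1.0]]]
edge = [[[0.0, 10.0], [-10.0, 0.0]]]
enhanced = edge_guided_enhance(sal, edge)
```

In this toy input, the position with a strong edge response (10.0) is amplified to nearly twice its original value, while the position with a strongly negative response (-10.0) is left almost unchanged. A real module would typically learn the reduction with convolutions rather than use a fixed channel mean.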