CAGNet: Content-Aware Guidance for Salient Object Detection

Sina Mohammadi, Mehrdad Noori, Ali Bahri, Sina Ghofrani Majelan, Mohammad Havaei

2020 (modified: 06 Nov 2023) · Pattern Recognit. 2020 · Readers: Everyone

Highlights
• A Content-Aware Guidance Network for Salient Object Detection is introduced.
• The diverse recognition abilities of multi-level features are exploited to guide the features.
• Powerful multi-scale features are extracted by enabling dense connections within large regions in the feature maps.
• Our designed loss function outperforms the widely used Cross-entropy loss by a large margin.
• Our method achieves state-of-the-art performance on five challenging datasets.

Abstract: Benefiting from Fully Convolutional Neural Networks (FCNs), saliency detection methods have achieved promising results. However, it is still challenging to learn effective features for detecting salient objects in complicated scenarios, in which i) non-salient regions may have a “salient-like” appearance; ii) the salient objects may have different-looking regions. To handle these complex scenarios, we propose a Feature Guide Network which exploits the nature of low-level and high-level features to i) make foreground and background regions more distinct and suppress the non-salient regions which have a “salient-like” appearance; ii) assign the foreground label to different-looking salient regions. Furthermore, we utilize a Multi-scale Feature Extraction Module (MFEM) for each level of abstraction to obtain multi-scale contextual information. Finally, we design a loss function which outperforms the widely used Cross-entropy loss. By adopting four different pre-trained models as the backbone, we show that our method generalizes well with respect to the choice of the backbone model. Experiments on six challenging datasets demonstrate that our method achieves state-of-the-art performance in terms of different evaluation metrics. Additionally, our approach contains fewer parameters than existing methods, does not need any post-processing, and runs in real time at 28 FPS when processing a 480 × 480 image.
