A Dense Multicross Self-Attention and Adaptive Gated Perceptual Unit Method for Few-Shot Semantic Segmentation

Published: 01 Jan 2024, Last Modified: 13 May 2025
Venue: IEEE Trans. Artif. Intell. 2024
License: CC BY-SA 4.0
Abstract: Few-shot semantic segmentation (FSSS) is a pivotal and widely studied research task in artificial intelligence. The task entails learning to differentiate between classes in a support set and applying this knowledge to samples in a query set. However, traditional deep learning methods tend to underperform in this setting because training samples are limited and the subtle correlations between query and support images are inadequately exploited. Existing FSSS methods often compress support information into class prototypes or use only part of the pixel-level support information, resulting in a significant loss of support information. In this article, we propose a novel auto FSSS method that employs dense multicross self-attention and adaptive gated perceptual units to tackle this challenge. Specifically, our proposed method treats each query pixel as a label and predicts its segmentation label as the sum of the labels of all support pixels. The method fully utilizes foreground and background support information through multilevel pixel correlations between paired query and support features, achieving state-of-the-art performance with only 1–5 annotated images. Moreover, our proposed adaptive gated perceptual unit filters and weights the information from each support image by adaptively learning gating values. This ensures that the model selects only the support image information most relevant to the current query image. The proposed method is evaluated on several popular FSSS datasets and compared with state-of-the-art methods. Additionally, a visual analysis of our method demonstrates its ability to distinguish different semantic categories and its robustness at segmentation boundaries.
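The label-aggregation and gating ideas described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the formulas, so the scaled dot-product similarity, the softmax weighting of support labels, and the softmax-normalized gates are all assumptions made for illustration. The function names (`aggregate_labels`, `gated_prediction`) are hypothetical, and the gate logits are passed in as fixed numbers rather than learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length feature vectors."""
    return sum(x * y for x, y in zip(a, b))


def aggregate_labels(query_feat, support_feats, support_labels):
    """Foreground score for one query pixel.

    Assumption: the "sum of labels of all support pixels" is weighted by
    softmax-normalized scaled dot-product similarity between the query
    pixel and each support pixel.
    """
    scale = math.sqrt(len(query_feat))
    weights = softmax([dot(query_feat, s) / scale for s in support_feats])
    return sum(w * y for w, y in zip(weights, support_labels))


def gated_prediction(query_feat, shots, gate_logits):
    """Combine per-support-image predictions with gate values.

    Assumption: one gate logit per support image, normalized with softmax.
    In a trained model these would be learned; here they are inputs.
    `shots` is a list of (support_feats, support_labels) pairs.
    """
    gates = softmax(gate_logits)
    preds = [aggregate_labels(query_feat, feats, labels)
             for feats, labels in shots]
    return sum(g * p for g, p in zip(gates, preds))


# Toy usage: a 2-D query feature and two support images ("shots"),
# each with two pixels labeled foreground (1.0) or background (0.0).
query = [1.0, 0.0]
shot_a = ([[1.0, 0.0], [-1.0, 0.0]], [1.0, 0.0])  # similar pixel is foreground
shot_b = ([[1.0, 0.0], [-1.0, 0.0]], [0.0, 1.0])  # similar pixel is background
print(gated_prediction(query, [shot_a, shot_b], gate_logits=[2.0, -2.0]))
```

In this toy setup, raising the gate logit of a support image pulls the combined score toward that image's prediction, which mirrors the abstract's description of selecting the support information most relevant to the query.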