SGT: Self-Guided Transformer for Few-Shot Semantic Segmentation

Published: 01 Jan 2024 · Last Modified: 13 Nov 2024 · ICASSP 2024 · CC BY-SA 4.0
Abstract: For the few-shot segmentation (FSS) task, existing methods attempt to capture the diversity of new classes by fully utilizing the limited support images, for example through cross-attention and prototype matching. However, they often overlook the fact that different regions of the same object vary in appearance, and that intra-image similarity is higher than inter-image similarity. To address these limitations, this paper proposes a Self-Guided Transformer (SGT) that leverages intra-image similarity to mitigate intra-object inconsistencies. The proposed SGT selectively guides segmentation, emphasizing regions that are easily distinguishable while adapting to the challenges posed by less discriminative regions within objects. Through a refined feature-interaction scheme and the novel SGT module, our method achieves state-of-the-art performance on various FSS datasets, demonstrating significant advances in few-shot semantic segmentation. The code is publicly available at https://github.com/HuHaigen/SGT.
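To make the core idea concrete, below is a minimal PyTorch sketch of what "self-guidance from intra-image similarity" could look like: query regions that confidently match a support prototype (the inter-image step) serve as pseudo-support, and the remaining, less discriminative regions attend to them within the same image (the intra-image step). The function names (`masked_average_pooling`, `self_guided_attention`) and the threshold `tau` are hypothetical illustrations, not the authors' actual implementation; see the linked repository for the real SGT module.

```python
import torch
import torch.nn.functional as F

def masked_average_pooling(feat, mask):
    """Support prototype: average features over the foreground mask.
    feat: (B, C, H, W), mask: (B, 1, H, W) with values in {0, 1}."""
    mask = F.interpolate(mask, size=feat.shape[-2:], mode="bilinear",
                         align_corners=False)
    proto = (feat * mask).sum(dim=(2, 3)) / (mask.sum(dim=(2, 3)) + 1e-6)
    return proto  # (B, C)

def self_guided_attention(query_feat, support_feat, support_mask, tau=0.7):
    """Hypothetical self-guidance sketch (not the official SGT code):
    1. Match query features to the support prototype (inter-image).
    2. Let all query locations attend only to the confidently matched
       query regions (intra-image), refining ambiguous object parts."""
    B, C, H, W = query_feat.shape
    proto = masked_average_pooling(support_feat, support_mask)       # (B, C)

    q = query_feat.flatten(2).transpose(1, 2)                        # (B, HW, C)
    sim = F.cosine_similarity(q, proto.unsqueeze(1), dim=-1)         # (B, HW)

    # Confident foreground regions inside the query image itself;
    # tau is an assumed confidence threshold.
    guide = (sim > tau).float().unsqueeze(1)                         # (B, 1, HW)

    # Intra-image attention restricted to the confident keys.
    attn = torch.softmax(q @ q.transpose(1, 2) / C ** 0.5, dim=-1)   # (B, HW, HW)
    attn = attn * guide                                              # mask out keys
    attn = attn / (attn.sum(dim=-1, keepdim=True) + 1e-6)            # renormalize
    refined = attn @ q                                               # (B, HW, C)

    return refined.transpose(1, 2).reshape(B, C, H, W)
```

In this sketch the guidance mask plays the role of the "easily distinguishable regions" mentioned in the abstract: because intra-image similarity tends to exceed inter-image similarity, letting harder regions aggregate features from confident regions of the same image can propagate labels more reliably than matching against the support image alone.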