Weakly Supervised Few-Shot Segmentation Through Textual Prompt

Published: 2024, Last Modified: 12 Nov 2025. Venue: ICASSP 2024. License: CC BY-SA 4.0
Abstract: Recently, significant progress has been made in few-shot segmentation (FSS), which aims to segment unknown objects with only a few support images. However, FSS still requires pixel-level annotations during both training and testing. When only image-level labels are available, FSS becomes a more challenging task, namely weakly supervised few-shot segmentation (WS-FSS). To address this problem, this paper proposes a novel text-driven approach that replaces pixel-level labels with textual prompts. To guide the model in selecting the target features and capturing the inter-class correlations, a Text-Image Matching Module (TIMM) and a Text Supervision Scheme (TSS) are designed for the feature matching and decoding stages, respectively. Extensive experiments are conducted on two public datasets, PASCAL-5i and COCO-20i. The experimental results demonstrate that our method not only outperforms existing state-of-the-art WS-FSS methods but also achieves performance comparable or even superior to that of advanced FSS models. The code is available at https://github.com/Joseph-Lee-V/Text-WS-FSS.