Abstract: The goal of few-shot semantic segmentation is to build a model from a small amount of annotated data that generalizes to new object classes. When the target domain differs substantially from the source domain, segmentation performance usually deteriorates sharply. Existing methods rely mainly on annotated source-domain data and attempt domain-independent feature extraction through feature transformation; however, bridging a large domain gap without any guidance from target-domain data remains highly challenging. To address the domain shift problem in cross-domain few-shot semantic segmentation (CD-FSS), this paper introduces Target-guided Cross-domain Few-shot Semantic Segmentation (TGCM), which incorporates a single labeled sample from the target domain to guide model learning during training. Specifically, TGCM uses a one-shot image and its mask from the target domain as auxiliary data and applies the CutMix method to augment the source-domain training data. A task-adaptive feature transformer (TAFT) module and a domain channel alignment (DCA) module then translate the features of the fused images into a feature space aligned with the target domain, reducing the domain drift caused by cross-domain discrepancies. Finally, we present a Dynamic Prediction (DP) strategy that helps the model progressively improve segmentation quality. Experimental results show that our model achieves significant improvements in CD-FSS, with average accuracies 4.71% and 2.95% higher than the baseline methods in the 1-shot and 5-shot settings, respectively. The code and dataset are available at https://github.com/08-401/TGCM_ACCV.
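The abstract's augmentation step, CutMix applied between a source-domain sample and the one labeled target-domain sample, can be sketched as follows. This is an illustrative implementation of standard CutMix adapted to segmentation pairs, not the authors' released code; the function name `cutmix_pair` and the argument layout are assumptions.

```python
import numpy as np

def cutmix_pair(src_img, src_mask, tgt_img, tgt_mask, lam=0.5, rng=None):
    """Paste a random rectangle from the target-domain image (and its mask)
    into the source-domain image, keeping pixels and labels aligned.

    lam is the fraction of area kept from the source; the pasted box
    covers roughly (1 - lam) of the image, as in standard CutMix.
    """
    rng = rng or np.random.default_rng(0)
    h, w = src_img.shape[:2]
    cut_h = int(h * np.sqrt(1.0 - lam))
    cut_w = int(w * np.sqrt(1.0 - lam))
    # Random box center, clipped so the box stays inside the image.
    cy = int(rng.integers(0, h))
    cx = int(rng.integers(0, w))
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)
    mixed_img = src_img.copy()
    mixed_mask = src_mask.copy()
    # The same rectangle is copied in both image and mask, so the
    # supervision for the pasted region comes from the target annotation.
    mixed_img[y1:y2, x1:x2] = tgt_img[y1:y2, x1:x2]
    mixed_mask[y1:y2, x1:x2] = tgt_mask[y1:y2, x1:x2]
    return mixed_img, mixed_mask
```

Because the mask is mixed with the exact same box as the image, every pixel in the fused sample still carries a correct label, which is what lets the downstream feature-alignment modules train on the fused data directly.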
External IDs: dblp:conf/accv/WeiLCQ24