Dual-Branch Fusion with Style Modulation for Cross-Domain Few-Shot Semantic Segmentation

Published: 20 Jul 2024, Last Modified: 21 Jul 2024MM2024 PosterEveryoneRevisionsBibTeXCC BY 4.0
Abstract: Cross-Domain Few-Shot Semantic Segmentation (CD-FSS) aims to achieve pixel-level segmentation of novel categories across various domains by transferring knowledge from the source domain leveraging limited samples. The main challenge in CD-FSS is bridging the inter-domain gap and addressing the scarcity of labeled samples in the target domain to enhance both generalization and discriminative abilities. Current methods usually resort to additional networks and complex strategy to embrace domain variability, which inevitably increases the training costs. This paper proposes a Dual-Branch Fusion with Style Modulation (DFSM) method to tackle this issues. We specifically deploy a parameter-free Grouped Style Modulation (GSM) layer that captures and adjusts a wide spectrum of potential feature distribution changes, thus improving the model's solution efficiency. Additionally, to overcome data limitations and enhance adaptability in the target domain, we develope a Dual-Branch Fusion (DBF) strategy which achieves accurate pixel-level prediction results by combining predicted probability maps through weighted fusion, thereby enhancing the discriminative ability of the model. We evaluate the proposed method on multiple widely-used benchmark datasets, including FSS-1000, ISIC, Chest X-Ray, and Deepglobe, and demonstrate superior performance compared to state-of-the-art methods in CD-FSS tasks.
Primary Subject Area: [Content] Vision and Language
Secondary Subject Area: [Content] Media Interpretation
Relevance To Conference: This method enhances few-shot semantic segmentation across domains by narrowing the gap. It combines multimedia processing and cross-domain learning to showcase the potential of multimedia technology.
Supplementary Material: zip
Submission Number: 5441
Loading