APSeg: Auto-Prompt Network for Cross-Domain Few-Shot Semantic Segmentation

Weizhao He, Yang Zhang, Wei Zhuo, Linlin Shen, Jiaqi Yang, Songhe Deng, Liang Sun

Published: 12 Jun 2024, Last Modified: 16 Oct 2025, CVPR 2024, Everyone, CC BY 4.0

Abstract: Few-shot semantic segmentation (FSS) endeavors to segment unseen classes with only a few labeled samples. Current FSS methods are commonly built on the assumption that their training and application scenarios share similar domains, and their performance degrades significantly when applied to a distinct domain. To address this, we propose to leverage the cutting-edge foundation model, the Segment Anything Model (SAM), for generalization enhancement. SAM, however, performs unsatisfactorily on domains that are distinct from its training data, which primarily comprise natural scene images, and it does not support automatic segmentation of specific semantics due to its interactive prompting mechanism. In our work, we introduce APSeg, a novel auto-prompt network for cross-domain few-shot semantic segmentation (CD-FSS), which is designed to be auto-prompted for guiding cross-domain segmentation. Specifically, we propose a Dual Prototype Anchor Transformation (DPAT) module that fuses pseudo query prototypes extracted based on cycle-consistency with support prototypes, allowing features to be transformed into a more stable domain-agnostic space. Additionally, a Meta Prompt Generator (MPG) module is introduced to automatically generate prompt embeddings, eliminating the need for manual visual prompts. We build an efficient model which can be applied directly to target domains without fine-tuning. Extensive experiments on four cross-domain datasets show that our model outperforms the state-of-the-art CD-FSS method by 5.24% and 3.10% in average accuracy on 1-shot and 5-shot settings, respectively.
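
The abstract does not spell out how the support prototypes or the cycle-consistent pseudo query prototypes are computed. The sketch below is only an illustration of one common way to realize these two ideas, not the authors' implementation: it assumes masked average pooling for the support prototype and a nearest-neighbor round trip (support foreground to query and back to support) under cosine similarity for the pseudo query prototype. The function names `masked_average_pooling` and `pseudo_query_prototype`, the `(C, H, W)` feature layout, and the choice of cosine similarity are all hypothetical. The fusion of the two prototypes and the DPAT transformation itself are not shown.

```python
import numpy as np


def masked_average_pooling(features, mask):
    """Average the (C, H, W) features over the locations where mask == 1.

    Assumed (not from the paper): a binary (H, W) mask at feature resolution.
    Returns a (C,) prototype vector.
    """
    mask = mask.astype(features.dtype)
    total = (features * mask[None]).sum(axis=(1, 2))
    return total / max(mask.sum(), 1e-6)


def cosine_similarity_matrix(a, b):
    """Pairwise cosine similarity between rows of a (N, C) and b (M, C)."""
    a = a / (np.linalg.norm(a, axis=1, keepdims=True) + 1e-8)
    b = b / (np.linalg.norm(b, axis=1, keepdims=True) + 1e-8)
    return a @ b.T


def pseudo_query_prototype(support_feat, support_mask, query_feat):
    """Hypothetical cycle-consistency check for a pseudo query prototype.

    For each support foreground location, find its most similar query
    location, then map that query location back to its most similar
    support location. Query locations whose round trip lands inside the
    support foreground are kept and averaged. Returns None if none survive.
    """
    channels = support_feat.shape[0]
    s = support_feat.reshape(channels, -1).T      # (Ns, C)
    q = query_feat.reshape(channels, -1).T        # (Nq, C)
    fg = support_mask.reshape(-1).astype(bool)    # (Ns,)

    fg_idx = np.flatnonzero(fg)
    if fg_idx.size == 0:
        return None

    sim = cosine_similarity_matrix(s, q)          # (Ns, Nq)
    forward = sim[fg_idx].argmax(axis=1)          # support fg -> query
    backward = sim[:, forward].argmax(axis=0)     # query -> support
    keep = fg[backward]                           # round trip ends in fg?

    matched = np.unique(forward[keep])
    if matched.size == 0:
        return None
    return q[matched].mean(axis=0)
```

Under these assumptions, a caller would compute `masked_average_pooling(support_feat, support_mask)` for the support side and `pseudo_query_prototype(support_feat, support_mask, query_feat)` for the query side, then combine the two in whatever way the fusion step requires.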