Abstract: Few-shot semantic segmentation (FSS) aims to segment unseen classes with only a few labeled samples. Current FSS methods commonly assume that their training and application scenarios share similar domains, and their performance degrades significantly when they are applied to a distinct domain. To address this, we propose to leverage a cutting-edge foundation model, the Segment Anything Model (SAM), to enhance generalization. However, SAM performs unsatisfactorily on domains that are distinct from its training data, which primarily comprise natural scene images, and it does not support automatic segmentation of specific semantics because of its interactive prompting mechanism. In this work, we introduce APSeg, a novel auto-prompt network for cross-domain few-shot semantic segmentation (CD-FSS), which is designed to be auto-prompted for guiding cross-domain segmentation. Specifically, we propose a Dual Prototype Anchor Transformation (DPAT) module that fuses pseudo query prototypes, extracted based on cycle-consistency, with support prototypes, allowing features to be transformed into a more stable domain-agnostic space. Additionally, a Meta Prompt Generator (MPG) module is introduced to automatically generate prompt embeddings, eliminating the need for manual visual prompts. The resulting model is efficient and can be applied directly to target domains without fine-tuning. Extensive experiments on four cross-domain datasets show that our model outperforms the state-of-the-art CD-FSS method by 5.24% and 3.10% in average accuracy in the 1-shot and 5-shot settings, respectively.
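
To make the idea of fusing support prototypes with cycle-consistent pseudo query prototypes more concrete, the following is a minimal NumPy sketch. It is not the APSeg implementation: the abstract does not specify how prototypes are pooled, how cycle-consistency is checked, or how the two prototypes are fused. The sketch assumes masked average pooling, a nearest-neighbor round trip (support to query and back to support) under cosine similarity, and a plain average for fusion. All function names are hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    """Row-wise L2 normalization so that dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def masked_average_pool(feats, mask):
    """Average the rows of feats (N, C) selected by the boolean mask (N,)."""
    weights = mask.astype(feats.dtype)
    return (feats * weights[:, None]).sum(axis=0) / max(weights.sum(), 1.0)

def pseudo_query_mask(support_feats, support_mask, query_feats):
    """Assumed cycle-consistency rule: a query position is marked as pseudo
    foreground if some support foreground position has it as nearest neighbor
    and its own nearest support neighbor is also foreground."""
    sim = l2_normalize(support_feats) @ l2_normalize(query_feats).T  # (Ns, Nq)
    s_to_q = sim.argmax(axis=1)  # nearest query position for each support position
    q_to_s = sim.argmax(axis=0)  # nearest support position for each query position
    fg_idx = np.flatnonzero(support_mask)
    targets = s_to_q[fg_idx]                   # forward step: support -> query
    consistent = support_mask[q_to_s[targets]]  # backward step lands in foreground?
    mask = np.zeros(query_feats.shape[0], dtype=bool)
    mask[targets[consistent]] = True
    return mask

def fused_prototype(support_feats, support_mask, query_feats):
    """Average the support prototype with the pseudo query prototype
    (the averaging rule is an assumption made for illustration)."""
    support_proto = masked_average_pool(support_feats, support_mask)
    q_mask = pseudo_query_mask(support_feats, support_mask, query_feats)
    if not q_mask.any():
        return support_proto  # no consistent query position: fall back to support only
    query_proto = masked_average_pool(query_feats, q_mask)
    return 0.5 * (support_proto + query_proto)
```

Here the features are flattened to shape (positions, channels); in practice they would come from an image encoder, and the fusion would likely be learned rather than a fixed average.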