DiSa: Saliency-Aware Foreground-Background Disentangled Framework for Open-Vocabulary Semantic Segmentation
Keywords: Foreground-Background Disentanglement, Vision-Language Model, Open-Vocabulary Semantic Segmentation
TL;DR: We leverage saliency to disentangle foreground and background features, mitigating Foreground Bias and improving spatial localization for open-vocabulary semantic segmentation.
Abstract: Open-vocabulary semantic segmentation aims to assign a label to every pixel of an image from an arbitrary set of text categories. State-of-the-art approaches typically build on vision-language models (VLMs) such as CLIP for dense prediction. However, VLMs are pre-trained on image-text pairs and are therefore biased toward salient, object-centric regions, which leads to two critical limitations when they are adapted to semantic segmentation: (i) Foreground Bias, a tendency to overlook background regions, and (ii) Limited Spatial Localization, which results in blurred object boundaries. To address these limitations, we introduce DiSa, a novel saliency-aware foreground-background disentangled framework. By explicitly incorporating saliency cues, the proposed Saliency-aware Disentanglement Module (SDM) models foreground and background ensemble features separately in a divide-and-conquer manner. In addition, we propose a Hierarchical Refinement Module (HRM) that exploits pixel-wise spatial context and performs channel-wise feature refinement through multi-level updates. Extensive experiments on six open-vocabulary semantic segmentation benchmarks demonstrate that DiSa consistently outperforms current state-of-the-art methods.
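The abstract only sketches the SDM at a high level, so the snippet below is a minimal illustrative sketch, not the paper's actual module: it assumes dense CLIP-style features, a precomputed saliency map, and a hypothetical `saliency_disentangled_logits` function in which foreground and background ensemble features are obtained by saliency-weighted pooling, fused back into the pixel features per branch, and then compared against text embeddings. All names, shapes, and the fusion scheme are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F


def saliency_disentangled_logits(feats, text_emb, saliency, tau=0.07):
    """Hypothetical sketch of saliency-guided foreground/background disentanglement.

    feats    : dense visual features, shape (B, C, H, W) (e.g. from a CLIP image encoder)
    text_emb : class text embeddings, shape (K, C) (e.g. from the CLIP text encoder)
    saliency : saliency map in [0, 1], shape (B, 1, H, W)
    tau      : similarity temperature
    Returns per-pixel class logits of shape (B, K, H, W).
    """
    eps = 1e-6
    # Foreground / background "ensemble" features via saliency-weighted pooling.
    w_fg = saliency / (saliency.sum(dim=(2, 3), keepdim=True) + eps)
    w_bg = (1.0 - saliency) / ((1.0 - saliency).sum(dim=(2, 3), keepdim=True) + eps)
    fg_proto = (feats * w_fg).sum(dim=(2, 3))   # (B, C)
    bg_proto = (feats * w_bg).sum(dim=(2, 3))   # (B, C)

    # Divide and conquer: enrich each pixel with its own region context,
    # then recombine the two branches gated by saliency.
    fg_feats = feats + fg_proto[..., None, None]
    bg_feats = feats + bg_proto[..., None, None]
    fused = saliency * fg_feats + (1.0 - saliency) * bg_feats

    # Per-pixel cosine similarity against every class prompt.
    fused = F.normalize(fused, dim=1)
    text_emb = F.normalize(text_emb, dim=1)
    return torch.einsum("bchw,kc->bkhw", fused, text_emb) / tau


if __name__ == "__main__":
    # Toy usage with random tensors (shapes only; no pretrained weights involved).
    feats = torch.randn(2, 512, 32, 32)
    text_emb = torch.randn(10, 512)        # 10 candidate class prompts
    saliency = torch.rand(2, 1, 32, 32)
    print(saliency_disentangled_logits(feats, text_emb, saliency).shape)
    # torch.Size([2, 10, 32, 32])
```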
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 6328