Dual-Structure Self-Distilled Learning for Enhancing Unsupervised Semantic Segmentation

Jing Luo; Xiaoliu Luo; Mengzhu Wang; Zuotao Fu; Xu Wang; Taiping Zhang

16 Sept 2025 (modified: 12 Feb 2026) | ICLR 2026 Conference Desk Rejected Submission | CC BY 4.0

Keywords: Dual-Structure, Self-Distilled Learning, Unsupervised Semantic Segmentation

TL;DR: A novel framework that performs self-distillation within a single network by transferring the stronger semantic representations learned in deeper layers to guide shallower layers, without relying on external teachers.

Abstract: Unsupervised semantic segmentation (USS) aims to assign semantic labels to pixels without human annotations, yet existing methods struggle to capture semantic structures across different abstraction levels. We propose Dual-Structure Self-Distilled Learning (DSSDL), a novel framework that performs self-distillation within a single network by transferring the stronger semantic representations learned in label space to guide shallower layers, without relying on external teachers. DSSDL integrates two complementary structures: (1) an affinity structure that performs binary pair classification over pairwise similarity scores and leverages a reversed directional mining strategy to preserve fine-grained local consistency; and (2) a cluster structure that derives semantic codes from global prototypes and aligns per-pixel predictions via a swapped prediction loss to encourage consistent global grouping. By jointly modeling both structures, DSSDL enforces semantic consistency at both local and global levels, resulting in coherent and robust segmentations. Our method achieves substantial improvements over the strong baseline STEGO, with accuracy and mIoU gains of +16.7 and +3.3 on COCO-Stuff, +14.8 and +3.2 on Cityscapes, and +8.2 and +11.5 on Potsdam-3, respectively.
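
The two loss families named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the function names, the similarity threshold, the temperatures, and the use of a sharpened softmax as a stand-in for "semantic codes" are all assumptions made for illustration. The reversed directional mining strategy is not modeled here.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    lse = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - lse for s in scaled]

def pair_classification_loss(student_feats, teacher_feats, threshold=0.5):
    """Binary pair classification over pairwise similarity scores.

    Assumption: teacher similarities are thresholded into 0/1 pair
    targets, and student similarities are mapped from [-1, 1] to a
    probability and scored with binary cross-entropy.
    """
    n = len(student_feats)
    total, count = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            t_sim = cosine(teacher_feats[i], teacher_feats[j])
            target = 1.0 if t_sim > threshold else 0.0
            p = 0.5 * (cosine(student_feats[i], student_feats[j]) + 1.0)
            p = min(max(p, 1e-7), 1.0 - 1e-7)  # clip to avoid log(0)
            total += -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))
            count += 1
    return total / count

def swapped_prediction_loss(logits_a, logits_b, temperature=0.1):
    """Swapped prediction: codes from one branch supervise the other.

    Assumption: codes are a sharpened softmax over prototype scores;
    the paper may derive them differently from its global prototypes.
    """
    total = 0.0
    for la, lb in zip(logits_a, logits_b):
        log_p_a = log_softmax(la, temperature)
        log_p_b = log_softmax(lb, temperature)
        q_a = [math.exp(x) for x in log_p_a]  # codes for branch a
        q_b = [math.exp(x) for x in log_p_b]  # codes for branch b
        ce_ab = -sum(q * lp for q, lp in zip(q_b, log_p_a))
        ce_ba = -sum(q * lp for q, lp in zip(q_a, log_p_b))
        total += 0.5 * (ce_ab + ce_ba)
    return total / len(logits_a)
```

In this sketch the pair loss acts on local pairwise structure, while the swapped loss penalizes the two branches for assigning the same pixel to different prototypes, which mirrors the local and global roles the abstract attributes to the affinity and cluster structures.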

Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning

Submission Number: 6630
