Bridged Clustering for Representation Learning: Semi-Supervised Sparse Bridging

18 Sept 2025 (modified: 11 Feb 2026), Submitted to ICLR 2026, CC BY 4.0
Keywords: Semi-supervised Learning, Clustering
Abstract: We introduce Bridged Clustering, a semi-supervised framework for learning predictors from unpaired input $\mathcal{X}$ and output $\mathcal{Y}$ datasets. Our method first clusters $\mathcal{X}$ and $\mathcal{Y}$ independently, then learns a sparse, interpretable bridge between the two sets of clusters using only a few paired examples. At inference, a new input $x$ is assigned to its nearest input cluster, and the centroid of the linked output cluster is returned as the prediction $\hat{y}$. Unlike traditional semi-supervised learning (SSL), Bridged Clustering explicitly leverages output-only data, and unlike dense transport-based methods, it maintains a sparse and interpretable alignment. Through theoretical analysis, we show that when the mis-clustering and mis-bridging rates are bounded, our algorithm is an effective and efficient predictor. Empirically, our method is competitive with state-of-the-art methods while remaining simple, model-agnostic, and highly label-efficient in low-supervision settings.
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 14205
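
The pipeline described in the abstract (cluster each space independently, bridge the clusters with a few pairs, predict the linked output centroid) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the choice of k-means with farthest-first initialization, the majority-vote bridge, and all function names (`nearest`, `kmeans`, `learn_bridge`, `predict`) are assumptions made here for illustration only.

```python
import numpy as np


def nearest(points, centroids):
    """Index of the closest centroid for each row of `points` (Euclidean)."""
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(data, k, n_iter=20):
    """Plain Lloyd's k-means with farthest-first initialization.

    Assumption: the abstract does not name a clustering algorithm;
    k-means is used here only as a stand-in.
    """
    centroids = [data[0]]
    for _ in range(1, k):
        chosen = np.array(centroids)
        gap = np.linalg.norm(data[:, None, :] - chosen[None, :, :], axis=2).min(axis=1)
        centroids.append(data[gap.argmax()])
    centroids = np.array(centroids, dtype=float)
    for _ in range(n_iter):
        labels = nearest(data, centroids)
        for j in range(k):
            members = data[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids


def learn_bridge(x_centroids, y_centroids, x_pairs, y_pairs):
    """Map each input cluster to one output cluster by majority vote.

    Assumption: majority voting over the paired examples is one simple way
    to obtain a sparse bridge. Input clusters that receive no paired
    example default to output cluster 0.
    """
    votes = np.zeros((len(x_centroids), len(y_centroids)), dtype=int)
    np.add.at(votes, (nearest(x_pairs, x_centroids), nearest(y_pairs, y_centroids)), 1)
    return votes.argmax(axis=1)


def predict(x_new, x_centroids, y_centroids, bridge):
    """Return the centroid of the output cluster linked to each input's cluster."""
    return y_centroids[bridge[nearest(x_new, x_centroids)]]
```

In this sketch the bridge is a single integer per input cluster, which is what keeps the alignment sparse and easy to inspect; a dense transport plan would instead store a weight for every pair of clusters.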