Calibrated Information Bottleneck for Trusted Multi-modal Clustering

Published: 26 Jan 2026, Last Modified: 26 Feb 2026, Venue: ICLR 2026 Poster, License: CC BY 4.0
Keywords: Multi-modal Clustering, Information Bottleneck
TL;DR: This paper proposes a novel CaLibrated Information Bottleneck (CLIB) framework for trusted multi-modal clustering.
Abstract: Information Bottleneck (IB) theory is known for its ability to learn simple, compact, and effective data representations. In multi-modal clustering, IB can eliminate interfering redundancy and noise from multi-modal data while preserving as much discriminative information as possible. However, existing IB-based multi-modal clustering methods suffer from low-quality pseudo-labels and over-reliance on accurate Mutual Information (MI) estimation, which is known to be challenging. Moreover, unreliable or noisy pseudo-labels may lead to overconfident clustering outcomes. To address these challenges, this paper proposes a novel CaLibrated Information Bottleneck (CLIB) framework designed to learn a clustering that is both accurate and trustworthy. We build a parallel multi-head network architecture, comprising one primary cluster head and several modality-specific calibration heads, which serves three goals: calibrating the distortions introduced by biased MI estimation, thereby improving the stability of IB; constructing reliable target variables for IB from multiple modalities; and producing a trustworthy clustering result. Notably, we design a dynamic pseudo-label selection strategy based on information redundancy theory to extract high-quality pseudo-labels, thereby enhancing training stability. Experimental results demonstrate that our model not only achieves competitive clustering accuracy on multiple benchmark datasets but also exhibits excellent performance on the expected calibration error (ECE) metric. Code is available at https://shizhehu.github.io/.
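For readers unfamiliar with the expected calibration error mentioned in the abstract, the following is a minimal sketch of the standard binned definition: predictions are grouped by confidence, and the gap between mean confidence and accuracy in each bin is averaged with weights proportional to bin size. It is not taken from the paper's code. The function name, the equal-width binning, and the default of 10 bins are assumptions made for illustration, and the `correct` flags are assumed to have been obtained already (for clustering, typically after matching predicted clusters to ground-truth classes, a step the abstract does not describe).

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Standard binned ECE (illustrative sketch, not the paper's implementation).

    confidences: predicted confidence in [0, 1] for each sample.
    correct:     1 if the sample's prediction is right, else 0.
    n_bins:      number of equal-width confidence bins (assumed default).
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one sample is required")

    conf_sum = [0.0] * n_bins  # sum of confidences per bin
    hit_sum = [0.0] * n_bins   # number of correct predictions per bin
    count = [0] * n_bins       # number of samples per bin

    for c, h in zip(confidences, correct):
        # Equal-width bins [k/n_bins, (k+1)/n_bins); confidence 1.0 goes to the last bin.
        b = min(int(c * n_bins), n_bins - 1)
        conf_sum[b] += c
        hit_sum[b] += h
        count[b] += 1

    ece = 0.0
    for b in range(n_bins):
        if count[b] == 0:
            continue  # empty bins contribute nothing
        accuracy = hit_sum[b] / count[b]
        mean_conf = conf_sum[b] / count[b]
        ece += (count[b] / n) * abs(accuracy - mean_conf)
    return ece
```

A lower value indicates that the reported confidences track the observed accuracy more closely; an overconfident model has mean confidence above accuracy in its high-confidence bins.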
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 4840