Toward Generalized Multistage Clustering: Multiview Self-Distillation

Jiatai Wang, Zhiwei Xu, Xin Wang, Tao Li

Published: 01 Jan 2025, Last Modified: 26 Jul 2025. IEEE Trans. Neural Networks Learn. Syst. 2025. License: CC BY-SA 4.0

Abstract: Existing multistage clustering methods first learn salient features from multiple views independently and then perform the clustering task. In particular, multiview clustering (MVC) has attracted considerable attention in multiview and multimodal scenarios. MVC aims to explore common semantics and pseudo-labels across multiple views and to cluster in a self-supervised manner. However, limited by noisy data and inadequate feature learning, this clustering paradigm generates overconfident pseudo-labels that mislead the model into producing inaccurate predictions. A method that corrects this pseudo-label misguidance in multistage clustering is therefore desirable to avoid bias accumulation. To alleviate the effect of overconfident pseudo-labels and improve the generalization ability of the model, this article proposes a novel multistage deep MVC framework in which multiview self-distillation (DistilMVC) is introduced to distill the dark knowledge of the label distribution. Specifically, in feature subspaces at different hierarchies, we explore the common semantics of multiple views through contrastive learning and obtain pseudo-labels by maximizing the mutual information between views. In addition, a teacher network distills the pseudo-labels into dark knowledge and uses it to supervise the student network, improving the student's predictive capability and robustness. Extensive experiments on real-world multiview datasets show that our method achieves better clustering performance than state-of-the-art (SOTA) methods.
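
The teacher-student idea in the abstract can be illustrated with a generic temperature-softened distillation loss. The sketch below is not the DistilMVC objective, which the abstract does not specify; it is a standard KL-based distillation term, and the names `softmax`, `distillation_loss`, and `temperature` are hypothetical choices made for illustration only.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert one sample's cluster logits into a probability distribution.

    A temperature above 1 flattens the distribution, exposing the relative
    similarity between clusters (the so-called dark knowledge).
    """
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Mean KL(teacher || student) over a batch of per-sample logit lists.

    Assumption: both networks output one logit per cluster for each sample.
    """
    total = 0.0
    for s_row, t_row in zip(student_logits, teacher_logits):
        p_teacher = softmax(t_row, temperature)
        p_student = softmax(s_row, temperature)
        total += sum(
            pt * (math.log(pt) - math.log(max(ps, 1e-12)))
            for pt, ps in zip(p_teacher, p_student)
            if pt > 0.0
        )
    return total / len(student_logits)


# Hypothetical logits for two samples and three clusters.
teacher = [[8.0, 1.0, 0.5], [0.2, 5.0, 4.5]]
student = [[3.0, 2.5, 0.1], [1.0, 2.0, 2.2]]

hard = softmax(teacher[0], temperature=1.0)  # near one-hot, overconfident
soft = softmax(teacher[0], temperature=4.0)  # flatter, softened target
loss = distillation_loss(student, teacher, temperature=4.0)
```

Raising the temperature lowers the largest probability in the teacher's output, so the student is trained against a softer target rather than a near one-hot pseudo-label. The loss is zero when student and teacher logits coincide and positive otherwise.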