Abstract: Multi-view clustering (MVC) explores the common semantics shared by multiple views and has been widely used to support management tasks with unlabeled training data. In the real world, however, spatio-temporal asynchronism often leaves multi-view data missing or unaligned. This limitation poses significant challenges for learning consistent representations. This paper proposes a deep MVC framework in which data recovery and alignment are fused hierarchically from an information-theoretic perspective, maximizing the mutual information among different views and ensuring the consistency of their latent spaces. To address missing views, we use dual prediction for instance-level alignment, while contrastive reconstruction enhances the mutual information of features within the same class for class-level alignment. To our knowledge, this may be the first attempt to show that recovery and alignment can be solved simultaneously in a unified theoretical framework. Extensive experiments show that our method outperforms baseline methods even when views are missing and unaligned.
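As a rough illustration of what maximizing mutual information between views can look like in practice, the sketch below computes an InfoNCE-style contrastive loss over paired embeddings from two views. The abstract does not specify the paper's objective, so this is a commonly used stand-in rather than the authors' formulation. The function names (`cosine`, `info_nce`, `mi_lower_bound`), the temperature value, and the toy embeddings are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE loss, assuming row i of view_a pairs with row i of view_b.

    This is a generic contrastive objective, not the paper's exact loss.
    """
    n = len(view_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(view_a[i], view_b[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidates in the other view.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Negative log-probability of picking the true partner (index i).
        total += log_denominator - logits[i]
    return total / n


def mi_lower_bound(view_a, view_b, temperature=0.5):
    """Standard InfoNCE estimate: I(A; B) >= log(N) - loss."""
    return math.log(len(view_a)) - info_nce(view_a, view_b, temperature)


# Hypothetical 2-D embeddings of three instances seen from two views.
aligned_a = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
aligned_b = [[0.9, 0.1], [0.1, 0.9], [-0.9, -0.1]]
# The same second view with its rows rotated, i.e. instances no longer aligned.
shuffled_b = [aligned_b[1], aligned_b[2], aligned_b[0]]

loss_aligned = info_nce(aligned_a, aligned_b)
loss_shuffled = info_nce(aligned_a, shuffled_b)
print(loss_aligned < loss_shuffled)
```

In this toy setup the correctly aligned pairs give a lower loss than the rotated pairs, which is the signal a contrastive objective of this kind relies on when instances must be matched across views.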