Abstract: Real-world data are typically complex, formed by the interaction of multiple latent factors. Disentangling these latent factors can effectively improve the robustness and interpretability of sample representations. However, most existing disentangled multi-view clustering methods focus only on making the disentangled representations mutually irrelevant, ignoring the invariance of semantic relevance between different latent factors. To address this issue, we propose Disentangled contrastive Multi-View Clustering via Semantic relevance invariance (DMVCS), which learns disentangled representations while preserving their semantic relevance. Specifically, we first decompose each view into consistent and specific representations by maximizing semantic consistency and minimizing the correlation across multiple views. Meanwhile, to ensure that different disentangled representations share similar semantic relevance, we propose a cross-component semantic relevance alignment module. Combined with a hierarchical sampling strategy, the learned semantic relevances are aligned progressively in a locally structure-aware manner. In addition, to learn a clustering-friendly unified representation, we propose multi-hop neighbor contrastive learning, which extends the range of positive samples. Comprehensive experiments on ten public multi-view datasets demonstrate that DMVCS outperforms state-of-the-art clustering methods.
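The decomposition step described above can be illustrated with a minimal sketch: each view encoder emits a consistent code and a view-specific code, a cross-view term encourages semantic consistency of the consistent codes, and a decorrelation term suppresses correlation between the consistent and specific codes. All module names, dimensions, and loss choices below are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ViewEncoder(nn.Module):
    """Encodes one view into a consistent code and a view-specific code."""

    def __init__(self, in_dim: int, latent_dim: int):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU())
        self.consistent_head = nn.Linear(256, latent_dim)
        self.specific_head = nn.Linear(256, latent_dim)

    def forward(self, x):
        h = self.backbone(x)
        return self.consistent_head(h), self.specific_head(h)


def consistency_loss(c1, c2):
    """Encourage cross-view semantic consistency (cosine similarity here)."""
    return 1.0 - F.cosine_similarity(c1, c2, dim=-1).mean()


def decorrelation_loss(c, s):
    """Penalize correlation between consistent and specific codes."""
    c = (c - c.mean(0)) / (c.std(0) + 1e-6)
    s = (s - s.mean(0)) / (s.std(0) + 1e-6)
    corr = (c.T @ s) / c.shape[0]   # batch cross-correlation matrix
    return corr.pow(2).mean()       # push all entries toward zero


if __name__ == "__main__":
    # Two toy views of the same 64 samples with different feature dimensions.
    x1, x2 = torch.randn(64, 100), torch.randn(64, 80)
    enc1, enc2 = ViewEncoder(100, 32), ViewEncoder(80, 32)
    c1, s1 = enc1(x1)
    c2, s2 = enc2(x2)
    loss = (consistency_loss(c1, c2)
            + decorrelation_loss(c1, s1)
            + decorrelation_loss(c2, s2))
    print(f"toy disentanglement loss: {loss.item():.4f}")
```

The same skeleton extends to more than two views by summing the pairwise consistency terms and one decorrelation term per view; the paper's additional modules (semantic relevance alignment, hierarchical sampling, multi-hop neighbor contrastive learning) would be added on top of these codes.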