PROTOCOL: Partial Optimal Transport-enhanced Contrastive Learning for Imbalanced Multi-view Clustering
TL;DR: This paper addresses class imbalance in multi-view clustering by formulating the clustering step as a partial optimal transport problem and combining it with class-rebalanced contrastive learning.
Abstract: While contrastive multi-view clustering has achieved remarkable success, it implicitly assumes a balanced class distribution. However, real-world multi-view data is often class-imbalanced. Consequently, existing methods suffer performance degradation because they can neither perceive nor model this imbalance. To address this challenge, we present the first systematic study of imbalanced multi-view clustering, focusing on two fundamental problems: *i. perceiving the imbalanced class distribution*, and *ii. mitigating representation degradation of minority samples*. We propose PROTOCOL, a novel PaRtial Optimal TranspOrt-enhanced COntrastive Learning framework for imbalanced multi-view clustering. First, for class imbalance perception, we map multi-view features into a consensus space and reformulate imbalanced clustering as a partial optimal transport (POT) problem, augmented with *progressive mass constraints* and a *weighted KL divergence* on the class distribution. Second, we develop POT-enhanced class-rebalanced contrastive learning at both the feature and class levels, incorporating *logit adjustment* and *class-sensitive learning* to enhance minority sample representations. Extensive experiments demonstrate that PROTOCOL significantly improves clustering performance on imbalanced multi-view data, filling a critical research gap in this field.
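As an illustration of the *logit adjustment* idea mentioned in the abstract, the following is a minimal sketch of the generic technique, not PROTOCOL's actual implementation. It assumes class priors are already available (in PROTOCOL they would presumably come from the POT-based class distribution estimate); the function names and the temperature `tau` are hypothetical.

```python
import math

def logit_adjusted_probs(logits, class_priors, tau=1.0):
    """Softmax over logits shifted by tau * log(prior).

    class_priors is assumed to be a strictly positive estimate of the
    class distribution; how it is obtained is outside this sketch.
    """
    adjusted = [z + tau * math.log(p) for z, p in zip(logits, class_priors)]
    m = max(adjusted)  # subtract the max for numerical stability
    exps = [math.exp(a - m) for a in adjusted]
    total = sum(exps)
    return [e / total for e in exps]

def logit_adjusted_loss(logits, class_priors, target, tau=1.0):
    """Cross-entropy on the prior-adjusted logits for one sample."""
    return -math.log(logit_adjusted_probs(logits, class_priors, tau)[target])

# With identical raw logits, a minority-class target incurs a larger loss,
# so training has to raise the minority logit by a larger margin.
priors = [0.9, 0.1]
loss_majority = logit_adjusted_loss([0.0, 0.0], priors, target=0)
loss_minority = logit_adjusted_loss([0.0, 0.0], priors, target=1)
```

With uniform priors the shift is the same for every class, so the sketch reduces to ordinary softmax cross-entropy; the rebalancing effect comes entirely from how skewed the estimated priors are.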
Lay Summary: In real-world scenarios, multi-source data is often class-imbalanced. Existing multi-view clustering methods, which typically assume balanced datasets, struggle to perceive and model this imbalance. To address this challenge, we propose PROTOCOL, a novel PaRtial Optimal TranspOrt-enhanced COntrastive Learning framework that can both perceive class imbalance in multi-view data and mitigate the representation degradation of minority samples. Extensive experimental results demonstrate that PROTOCOL consistently achieves outstanding performance across various imbalance ratios, providing more reliable technical support for data mining in practical fields such as healthcare and sensor networks.
Link To Code: https://github.com/Scarlett125/PROTOCOL
Primary Area: General Machine Learning->Clustering
Keywords: multi-view clustering, class-imbalanced learning, unbalanced optimal transport, partial optimal transport
Submission Number: 6802