Scalable Fuzzy Clustering With Collaborative Structure Learning and Preservation

Published: 01 Jan 2025, Last Modified: 04 Nov 2025. Venue: IEEE Trans. Fuzzy Syst. 2025. License: CC BY-SA 4.0
Abstract: Fuzzy C-Means (FCM) partitions samples into clusters by computing the membership degree of each sample to each cluster center, yielding soft labels, and has attracted significant attention in recent years. However, existing FCM methods face three challenges. First, traditional FCM focuses on learning membership degrees and neglects the similarity structure of the data. Second, graph-based FCM typically separates graph construction from clustering, which overlooks the knowledge interaction between the graph and the clustering and leads to suboptimal performance. Third, exploring the similarity structure among all samples is computationally expensive for large-scale tasks. To address these challenges, we propose scalable fuzzy clustering with collaborative structure learning and preservation (CSLP), which leverages both cluster information and similarity structures to learn an optimal membership degree representation. Specifically, a self-weighting scheme is devised to measure sample importance, thereby reducing the adverse impact of outliers. Moreover, the graph is updated according to the data similarities in the membership degree representation, so that CSLP learns the graph and the membership degrees collaboratively, in a mutually reinforcing manner. The similarity structures are thus fully explored during clustering and preserved in the learned membership degrees, which makes the clustering labels more discriminative. To further improve efficiency, an acceleration scheme reduces the computational cost of CSLP by propagating membership degrees from potential centers to samples, making CSLP scalable to large-scale tasks. An iterative strategy is designed to solve the formulated objective function. Extensive experiments demonstrate that CSLP outperforms other fuzzy clustering methods in both effectiveness and scalability.
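For readers unfamiliar with the baseline that the abstract builds on, the following is a minimal sketch of classical FCM, not of CSLP itself. It alternates the standard membership update and the weighted center update. The function names, the fuzzifier default `m=2.0`, the random initialization, and the stopping tolerance are all assumptions made for illustration; none of them come from the paper.

```python
import numpy as np

def fcm_memberships(X, centers, m=2.0, eps=1e-12):
    """Classical FCM membership update (hypothetical helper name).

    Returns U of shape (n_samples, n_clusters); each row sums to 1.
    """
    # Squared Euclidean distance from every sample to every center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    d2 = np.maximum(d2, eps)  # guard against division by zero
    # u_ij is proportional to d_ij^(-2/(m-1)); d2 already holds d^2.
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def fcm_centers(X, U, m=2.0):
    """Classical FCM center update: mean of samples weighted by u_ij^m."""
    W = U ** m
    return (W.T @ X) / W.sum(axis=0)[:, None]

def fcm(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Alternate the two updates until the centers stop moving."""
    rng = np.random.default_rng(seed)
    # Assumed initialization: pick distinct samples as starting centers.
    centers = X[rng.choice(len(X), size=n_clusters, replace=False)]
    for _ in range(n_iter):
        U = fcm_memberships(X, centers, m)
        new_centers = fcm_centers(X, U, m)
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift < tol:
            break
    return fcm_memberships(X, centers, m), centers
```

Note that the membership update treats every sample independently given the centers. This sketch therefore uses no similarity structure between samples, which is the first limitation the abstract attributes to traditional FCM.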