Abstract: Multi-view clustering (MVC) leverages complementary information from diverse data sources to enhance clustering performance. However, its practical deployment in distributed and privacy-sensitive scenarios remains challenging. Federated multi-view clustering (FMVC) has emerged as a potential solution, but existing approaches suffer from substantial limitations, including excessive communication overhead, insufficient privacy protection, and inadequate handling of missing views. To address these issues, we propose Efficient Federated Incomplete Multi-View Clustering (EFIMVC), a novel framework that introduces a localized optimization strategy to significantly reduce communication costs while ensuring theoretical convergence. EFIMVC employs both view-specific and shared anchor graphs as communication variables, thereby enhancing privacy by avoiding the transmission of sensitive embeddings. Moreover, EFIMVC seamlessly extends to scenarios with missing views, making it a practical and scalable solution for real-world applications. Extensive experiments on benchmark datasets demonstrate the superiority of EFIMVC in clustering accuracy, communication efficiency, and privacy preservation. Our code is publicly available at https://github.com/Tracesource/EFIMVC.
Lay Summary: Many real-world applications collect data from different sources (or “views”), like text, images, or sensor signals. Grouping such data without labels, a task called multi-view clustering, can reveal meaningful patterns. But in privacy-sensitive settings like healthcare or finance, this becomes tricky: data is often stored at separate locations and cannot be pooled, some views may be missing, and communication between locations is costly. We address these challenges by developing EFIMVC, a new federated multi-view clustering method. Instead of sharing sensitive data or complex models, EFIMVC only communicates simple graph-based summaries, preserving privacy and cutting down on communication. It also works well when some views are missing, something most existing methods can’t handle. Our method comes with a theoretical convergence guarantee and achieves strong clustering accuracy on benchmark datasets. It’s a step toward more efficient, privacy-aware learning from complex, distributed data.
Link To Code: https://github.com/Tracesource/EFIMVC
Primary Area: General Machine Learning->Clustering
Keywords: Multi-view Clustering, Incomplete Multi-view Clustering, Federated Learning
Submission Number: 3565