Similarity-Enhanced Homophily for Multi-View Heterophilous Graph Clustering

TMLR Paper3437 Authors

04 Oct 2024 (modified: 26 Oct 2024), Under review for TMLR, CC BY 4.0
Abstract: With the increasing prevalence of graph-structured data, multi-view graph clustering has become a fundamental technique in various applications. While existing methods often employ a unified message passing mechanism to enhance clustering performance, this approach is less effective in heterophilous scenarios, where nodes with dissimilar features are connected. Our experiments confirm this: clustering performance degrades as the heterophily ratio increases. A natural remedy is to apply graph filters tailored to each graph's homophily ratio. However, this is impractical for unsupervised tasks, where neither labels nor homophily ratios are available. Instead, we start from the observation that implicit homophilous information may exist in similarity matrices even when the graph is heterophilous. Based on this observation, we explore a strategy that does not require prior knowledge of whether a graph is homophilous or heterophilous, and propose a novel data-centric unsupervised learning framework, namely SiMilarity-enhanced Homophily for Multi-view Heterophilous Graph Clustering (SMHGC). By analyzing the relationship between similarity and graph homophily, we propose to enhance homophily in a label-free manner by introducing three similarity terms: neighbor pattern similarity, node feature similarity, and multi-view global similarity. Then, a consensus-based inter- and intra-view fusion paradigm is proposed to fuse the homophily-enhanced graphs from different views and use them for clustering. State-of-the-art experimental results on both multi-view heterophilous and homophilous datasets highlight the effectiveness of using similarity for unsupervised multi-view graph learning, even in heterophilous settings. Furthermore, the consistent performance across semi-synthetic datasets with varying levels of homophily serves as further evidence of SMHGC's resilience to heterophily.
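The observation that similarity matrices may carry homophilous information even when the graph is heterophilous can be illustrated with a small toy example. The sketch below is not the SMHGC method; it assumes the common edge-homophily definition (fraction of edges joining same-class nodes) and uses a hypothetical helper, `cosine_knn_graph`, to build a graph from node feature similarity alone. The labels appear only to measure homophily on the toy data; building the similarity graph itself needs no labels.

```python
import numpy as np

def edge_homophily(adj, labels):
    """Fraction of undirected edges whose endpoints share a label (assumed definition)."""
    rows, cols = np.nonzero(np.triu(adj, k=1))  # count each undirected edge once
    if rows.size == 0:
        return 0.0
    return float(np.mean(labels[rows] == labels[cols]))

def cosine_knn_graph(features, k):
    """Hypothetical helper: symmetric k-nearest-neighbor graph from cosine similarity."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # guard against zero-norm rows
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)              # exclude self-matches
    nbrs = np.argsort(-sim, axis=1)[:, :k]      # indices of the k most similar nodes
    adj = np.zeros(sim.shape, dtype=int)
    adj[np.repeat(np.arange(len(features)), k), nbrs.ravel()] = 1
    return np.maximum(adj, adj.T)               # symmetrize

# Toy data (made up for illustration): two classes with well-separated features.
labels = np.array([0, 0, 0, 1, 1, 1])
features = np.array([[1.0, 0.0], [0.9, 0.1], [1.0, 0.2],
                     [0.0, 1.0], [0.1, 0.9], [0.2, 1.0]])

# A fully heterophilous graph: every edge connects nodes of different classes.
adj_het = np.zeros((6, 6), dtype=int)
for i, j in [(0, 3), (1, 4), (2, 5), (0, 4)]:
    adj_het[i, j] = adj_het[j, i] = 1

adj_sim = cosine_knn_graph(features, k=2)
h_original = edge_homophily(adj_het, labels)
h_similarity = edge_homophily(adj_sim, labels)
print(h_original, h_similarity)
```

On this toy data the given edges are all cross-class, while the feature-similarity graph links only same-class nodes, which is the kind of implicit homophily the abstract refers to. Real datasets are far noisier, so the gap would typically be smaller.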
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: Correct the title capitalization.
Assigned Action Editor: ~Bo_Dai1
Submission Number: 3437