Track: Graph algorithms and modeling for the Web
Keywords: Graph Clustering, Graph Diffusion, Graph Neural Network, Heterophily Graph
Abstract: Graph clustering seeks to partition the nodes of a graph into disjoint groups such that nodes within the same cluster are tightly knit, while nodes in different clusters are distant from each other. In practice, graphs are often accompanied by rich node attributes; such graphs are termed attributed graphs. By leveraging the complementary nature of graph topology and node attributes, graph neural networks (GNNs) have achieved encouraging performance in graph clustering. However, existing GNN-based approaches rely strongly on the assumption that the input graph is homophilic, and thus largely fail on heterophilic graphs and on graphs with numerous missing or noisy links, both of which are common in real-world data.
To bridge this gap, this paper presents DGAC, an effective graph-agnostic solution for graph clustering. In particular, DGAC overcomes the limitations of prior work by exploiting the high-order connectivity of nodes not only within the input graph G but also within the affinity graph H underlying the attribute data. To achieve this goal, we first unify embedding generation and cluster generation into a coherent framework that optimizes the Dirichlet energy on both G and H. Building on this framework, we develop theoretically grounded solvers for efficiently constructing the embeddings and clusters, which capture high-order semantics from G or H via graph diffusion. In addition, DGAC includes three loss functions that facilitate effective feature extraction and clustering. Extensive experiments comparing DGAC against 12 baselines on 12 homophilic and heterophilic graph datasets show that DGAC consistently and considerably outperforms all competitors in clustering quality measured against ground-truth labels.
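As a rough illustration of the quantities named above, the following minimal sketch computes the Dirichlet energy tr(Zᵀ L Z) of an embedding Z on a graph and applies a truncated diffusion to Z. It is not the DGAC solver: the abstract does not specify how the affinity graph H is built or which diffusion is used, so the cosine k-nearest-neighbor construction in `knn_affinity`, the personalized-PageRank-style weights in `diffuse`, and all function and parameter names are assumptions made for illustration only.

```python
import numpy as np


def normalized_laplacian(adj):
    """Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    return np.eye(adj.shape[0]) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]


def dirichlet_energy(adj, z):
    """Dirichlet energy tr(Z^T L Z); small when linked nodes have similar rows in Z."""
    z = np.asarray(z, dtype=float)
    return float(np.trace(z.T @ normalized_laplacian(adj) @ z))


def knn_affinity(x, k):
    """Hypothetical affinity graph H: symmetrized cosine k-NN graph over attributes.

    This construction is an assumption, not taken from the paper.
    """
    x = np.asarray(x, dtype=float)
    norms = np.maximum(np.linalg.norm(x, axis=1, keepdims=True), 1e-12)
    sim = (x / norms) @ (x / norms).T
    np.fill_diagonal(sim, -np.inf)           # exclude self-matches
    idx = np.argsort(-sim, axis=1)[:, :k]    # k most similar nodes per row
    h = np.zeros((x.shape[0], x.shape[0]))
    rows = np.repeat(np.arange(x.shape[0]), k)
    h[rows, idx.ravel()] = 1.0
    return np.maximum(h, h.T)                # make the graph undirected


def diffuse(adj, z, alpha=0.2, steps=10):
    """Truncated diffusion sum_t alpha * (1 - alpha)^t * S^t Z, with S = I - L.

    The personalized-PageRank-style weights are an assumed choice.
    """
    z = np.asarray(z, dtype=float)
    s = np.eye(len(z)) - normalized_laplacian(adj)
    out = alpha * z
    cur = z
    for _ in range(steps):
        cur = (1.0 - alpha) * (s @ cur)
        out = out + alpha * cur
    return out
```

Under these assumptions, an embedding that is constant within each connected component of a regular graph has zero Dirichlet energy, and diffusing an embedding over the graph does not increase its energy; the same functions can be applied to G (its adjacency matrix) and to H (the output of `knn_affinity`).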
Submission Number: 1776