Hyperspherical Prototype Node Clustering

Published: 22 Jan 2024, Last Modified: 22 Jan 2024. Accepted by TMLR.
Abstract: The general workflow of deep node clustering is to encode the nodes into node embeddings via graph neural networks and then derive clustering decisions from those embeddings, so clustering performance is heavily affected by embedding quality. However, existing works consider only preserving the semantics of the graph and ignore the inter-cluster separability of the nodes, so there is no guarantee that the embeddings present a clear clustering structure. To remedy this deficiency, we propose Hyperspherical Prototype Node Clustering (HPNC), an end-to-end clustering paradigm that explicitly enhances the inter-cluster separability of learned node embeddings. Concretely, we constrain the embedding space to a unit hypersphere, which enables us to scatter the cluster prototypes over the space with maximized pairwise distances. Then, we employ a graph autoencoder to map nodes onto the same hypersphere manifold. Consequently, cluster affinities can be directly retrieved from cosine similarities between node embeddings and prototypes. A clustering-oriented loss is imposed to sharpen the affinity distribution so that the learned node embeddings are encouraged to have small intra-cluster distances and large inter-cluster distances. Based on the proposed HPNC paradigm, we devise two schemes (HPNC-IM and HPNC-DEC) with distinct clustering backbones. Empirical results on popular benchmark datasets demonstrate the superiority of our method compared to other state-of-the-art clustering methods, and visualization results illustrate improved separability of the learned embeddings.
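The prototype and affinity steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). It assumes the embedding dimension is at least the number of clusters minus one, in which case a regular simplex is a maximally separated arrangement of prototypes on the unit hypersphere, and it assumes a temperature-scaled softmax to turn cosine similarities into affinities. The function names, the temperature value, and the example embedding are all hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm so that it lies on the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(u, v):
    """Cosine similarity of two unit vectors, which is just their dot product."""
    return sum(a * b for a, b in zip(u, v))


def simplex_prototypes(k):
    """Return k unit vectors in R^k whose pairwise cosine is -1/(k-1).

    Assumption: a closed-form regular simplex stands in for whatever
    prototype-placement procedure the paper actually uses.
    """
    prototypes = []
    for i in range(k):
        # Standard basis vector e_i with the centroid (1/k, ..., 1/k) subtracted.
        v = [(1.0 if j == i else 0.0) - 1.0 / k for j in range(k)]
        prototypes.append(normalize(v))
    return prototypes


def cluster_affinities(embedding, prototypes, temperature=0.1):
    """Softmax over cosine similarities between one node and every prototype.

    Assumption: the temperature-scaled softmax is illustrative only.
    """
    z = normalize(embedding)
    logits = [cosine(z, p) / temperature for p in prototypes]
    top = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - top) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


prototypes = simplex_prototypes(4)
node = [0.9, 0.1, -0.2, 0.05]  # hypothetical encoder output for one node
affinities = cluster_affinities(node, prototypes)
predicted = max(range(len(affinities)), key=affinities.__getitem__)
```

Here `predicted` is the index of the prototype closest to the node in cosine similarity. A lower temperature makes the affinity distribution sharper, which is the direction in which the abstract's clustering-oriented loss pushes the embeddings.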
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Length: Regular submission (no more than 12 pages of main content)
Code: https://github.com/MoetaYuko/HPNC
Assigned Action Editor: ~Guillaume_Rabusseau1
Submission Number: 1564