Abstract: Deep graph clustering (DGC), which seeks to partition the nodes of a graph into disjoint clusters, remains a challenging research task. Recent work has drawn significant attention to reliable clustering algorithms because of their promising performance. However, owing to the limitations of these clustering algorithms, existing DGC methods cannot effectively mine hard samples as high-confidence pseudo-labels, and therefore neglect crucial samples. To address this problem, this paper proposes a Distance-guided Pseudo Label Graph Clustering Network (DPLC), which introduces a comprehensive hard-node measurement criterion to improve clustering performance. Specifically, DPLC first divides the nodes into disjoint clusters using a clustering algorithm. It then uses carefully selected clustered nodes as high-confidence pseudo-labels, so that the feature distribution alignment becomes more accurate. In addition, unlike most DGC methods, which rely solely on contrastive learning, we also optimize the similarity between disconnected nodes, which may be more similar in the graph representation than connected nodes, thereby preventing irrational message passing while preserving structural information. Extensive experiments on five public datasets demonstrate the superiority of DPLC over competing methods.
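As a rough illustration of the "cluster first, then select high-confidence pseudo-labels" pipeline described above, the sketch below runs a plain k-means on node embeddings and keeps, per cluster, the nodes closest to their centroid. The abstract does not specify DPLC's actual selection criterion, so the distance-to-centroid rule, the function names `kmeans` and `select_confident`, and the `keep_ratio` parameter are all assumptions made for illustration rather than the authors' method.

```python
import math

def kmeans(points, init_centroids, iters=10):
    """Plain Lloyd's k-means with caller-supplied initial centroids (assumed setup)."""
    centroids = list(init_centroids)

    def assign():
        # Label each point with the index of its nearest centroid.
        return [min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
                for p in points]

    for _ in range(iters):
        labels = assign()
        # Move each centroid to the mean of its current members.
        for k in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == k]
            if members:
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return assign(), centroids

def select_confident(points, labels, centroids, keep_ratio=0.5):
    """Hypothetical rule: keep the keep_ratio fraction of each cluster nearest its centroid."""
    selected = []
    for k in range(len(centroids)):
        idx = [i for i, l in enumerate(labels) if l == k]
        idx.sort(key=lambda i: math.dist(points[i], centroids[k]))
        n_keep = max(1, int(len(idx) * keep_ratio)) if idx else 0
        selected.extend(idx[:n_keep])
    return sorted(selected)

# Toy 2-D "node embeddings": two well-separated groups of three nodes each.
embeddings = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.2),
              (5.0, 5.0), (5.1, 5.0), (4.0, 4.5)]
labels, centroids = kmeans(embeddings, [(0.0, 0.0), (5.0, 5.0)])
confident = select_confident(embeddings, labels, centroids, keep_ratio=0.5)
# Pseudo-labels are the cluster assignments of the selected nodes only.
pseudo_labels = {i: labels[i] for i in confident}
```

Under this rule, nodes far from every centroid are exactly the ones left without pseudo-labels, which is the kind of hard sample the abstract says existing methods neglect.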