Effectively Refining Parameters for Unexpected Nodes on Structural Graph Clustering

Chuanyu Zong, Chengwei Zhang, Xiufeng Xia, Tao Qiu, Jiaying Wang

Published: 01 Jan 2020, Last Modified: 06 Feb 2025, Venue: ISPA/BDCloud/SocialCom/SustainCom 2020, License: CC BY-SA 4.0

Abstract: SCAN is an effective structural graph clustering algorithm for detecting meaningful clusters and is widely used in graph data analysis applications. The problem of refining structural graph clustering parameters for an unexpected node is to explain why that node is included in a specified cluster of the SCAN results and how to remove it from that cluster. SCAN results are very sensitive to two clustering parameters, the similarity threshold ε and the density constraint μ; when these are set unreasonably, unexpected nodes may be included in the specified clusters. To address this problem, we first analyze how the parameters affect the SCAN results. We then propose two effective refining algorithms that remove the unexpected nodes from the specified cluster by optimizing the initial SCAN parameters with minimum penalty: one refines the parameter ε, and the other refines the parameter μ. Moreover, a penalty function is proposed so that the refined SCAN results retain as much of the original SCAN results as possible. Finally, comprehensive experiments on real datasets show that our refining model can efficiently refine the clustering parameters for unexpected nodes in the SCAN clustering results.
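
To illustrate the sensitivity to ε and μ mentioned in the abstract, here is a minimal Python sketch. It is not the paper's refining algorithm. It assumes the commonly cited SCAN definitions, which the abstract itself does not spell out: the structural similarity of two nodes is the size of the intersection of their closed neighborhoods divided by the geometric mean of the neighborhood sizes, and a node is a core when its ε-neighborhood contains at least μ nodes. The toy graph and all function names are hypothetical.

```python
from math import sqrt

# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to c.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

def closed_neighborhood(adj, v):
    """Neighbors of v plus v itself (assumed SCAN convention)."""
    return adj[v] | {v}

def structural_similarity(adj, u, v):
    """Shared closed neighbors, normalized by the geometric mean of the sizes."""
    gu, gv = closed_neighborhood(adj, u), closed_neighborhood(adj, v)
    return len(gu & gv) / sqrt(len(gu) * len(gv))

def eps_neighborhood(adj, v, eps):
    """Closed neighbors of v whose similarity to v is at least eps."""
    return {
        w for w in closed_neighborhood(adj, v)
        if structural_similarity(adj, v, w) >= eps
    }

def is_core(adj, v, eps, mu):
    """v is a core if its eps-neighborhood has at least mu nodes."""
    return len(eps_neighborhood(adj, v, eps)) >= mu

# Raising eps drops the weakly similar node d from c's eps-neighborhood,
# which can change whether c is a core for a given mu.
for eps in (0.7, 0.8):
    print(eps, sorted(eps_neighborhood(adj, "c", eps)), is_core(adj, "c", eps, 4))
```

In this toy graph, a small increase in ε excludes `d` from the ε-neighborhood of `c`, and lowering μ changes which nodes count as cores. Under the stated assumptions, these are the two levers the abstract describes for making an unexpected node leave a cluster.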