Dissecting graph measures performance for node clustering in LFR parameter space

Vladimir Ivashkin; Pavel Chebotarev

28 Sept 2020 (modified: 05 May 2023), ICLR 2021 Conference Blind Submission, Readers: Everyone

Keywords: graph theory, graph measures, kernel k-means, clustering

Abstract: Graph measures can be used to cluster graph nodes with metric clustering algorithms. Multiple measures are applicable to this task, and which one performs best is an open question. We study the performance of 25 graph measures on generated graphs with different parameters. While measure comparisons are usually limited to a general ranking of measures on a particular dataset, we aim to explore how the performance of various measures depends on graph features. Using an LFR generator, we create a dataset of ~7500 graphs covering the whole LFR parameter space. For each graph, we assess the quality of clustering with the k-means algorithm for every considered measure. We determine the best measure for every area of the parameter space. We find that the parameter space consists of distinct zones, in each of which one particular measure is the best. We analyze the geometry of the resulting zones and describe it with simple criteria. Given particular graph parameters, this allows us to choose the best measure to use for clustering.
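The step "determine the best measure for every area of the parameter space" can be illustrated with a minimal sketch. It is not the authors' code. The function name `best_measure_per_zone`, the fixed-width binning of the parameter space, the use of the mean score to pick a winner, the measure names, and all numbers below are assumptions made up for illustration only.

```python
from collections import defaultdict
from statistics import mean


def best_measure_per_zone(records, widths):
    """Pick the measure with the highest mean score in each zone.

    records: iterable of (params, measure_name, score), where params is a
             tuple of graph parameters (hypothetical layout).
    widths:  bin width for each parameter; a zone is a tuple of bin indices.
    """
    # zone -> measure -> list of clustering-quality scores
    scores = defaultdict(lambda: defaultdict(list))
    for params, measure, score in records:
        zone = tuple(int(p // w) for p, w in zip(params, widths))
        scores[zone][measure].append(score)

    best = {}
    for zone, by_measure in scores.items():
        means = {m: mean(v) for m, v in by_measure.items()}
        best[zone] = max(means, key=means.get)  # ties: first measure seen wins
    return best


# Toy, made-up scores for two placeholder measures over one parameter.
toy_records = [
    ((0.1,), "measure_A", 0.9), ((0.1,), "measure_B", 0.8),
    ((0.2,), "measure_A", 0.95), ((0.2,), "measure_B", 0.7),
    ((0.6,), "measure_A", 0.3), ((0.6,), "measure_B", 0.5),
]
zones = best_measure_per_zone(toy_records, widths=(0.5,))
```

With real data, each record would hold the LFR parameters of one generated graph, the name of one of the 25 measures, and the clustering quality obtained with it. The resulting mapping from zone to winning measure is what one would then inspect to describe the geometry of the zones.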

One-sentence Summary: We investigated the space of graph features and found zones of leadership for several graph node measures in the node clustering task.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Reviewed Version (pdf): https://openreview.net/references/pdf?id=gqGLiOmU5A
