Structural graph clustering on signed graphs: An index-based approach

Published: 01 Jan 2025, Last Modified: 11 Apr 2025, Inf. Sci. 2025, CC BY-SA 4.0
Abstract: Structural graph clustering (SCAN) is a foundational graph analysis task for managing and profiling graph datasets. Existing work on structural graph clustering focuses exclusively on unsigned graphs, and SCAN methods are not applicable to signed graphs, which carry both friendly and antagonistic relationships. To tackle this problem, we investigate a novel structural graph clustering model named SSCAN. Based on SSCAN, we devise an index structure, SSCAN-Index+, which stores information about cores and structural similarities. The space complexity of SSCAN-Index+ is bounded by $O(m)$, where $m$ is the number of edges in the given signed graph.
Building on our index, we propose an index-based query method, SSCAN-Query+, which reduces the number of expensive structural similarity computations and answers a query in $O\left(\sum_{u \in V_{|N[*]| \ge \mu}} |N[u]|\right)$ time. We also extend our techniques to support the cluster-group-by query problem: given a subset $S \subseteq V$, determine which vertices in $S$ belong to the same cluster. This allows our model to be applied to large-scale signed graphs, such as trust networks and user evaluation datasets, to find closely connected and structurally balanced communities.
Furthermore, we propose index maintenance algorithms with effective pruning techniques that update the clusters in $O\left(\sum_{w \in N(u) \cup N(v)} \log n\right)$ time when the input signed graph changes dynamically. Extensive experiments on eight real signed graphs demonstrate the effectiveness of our new clustering model and the efficiency of our proposed methods.
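For readers unfamiliar with the terms "structural similarity" and "core" used in the abstract, the following is a minimal sketch of the classic unsigned SCAN definitions. It is not the SSCAN model from this paper: the abstract does not specify how negative edges enter the similarity, so edge signs are not modeled here. The function names, the toy graph, and the parameter values `eps` and `mu` are illustrative assumptions.

```python
from math import sqrt


def closed_neighborhood(adj, u):
    """Return N[u]: the neighbors of u together with u itself."""
    return adj[u] | {u}


def structural_similarity(adj, u, v):
    """Classic unsigned SCAN similarity: |N[u] & N[v]| / sqrt(|N[u]| * |N[v]|)."""
    nu = closed_neighborhood(adj, u)
    nv = closed_neighborhood(adj, v)
    return len(nu & nv) / sqrt(len(nu) * len(nv))


def is_core(adj, u, eps, mu):
    """u is a core if at least mu vertices of N[u] have similarity >= eps with u."""
    similar = [
        v for v in closed_neighborhood(adj, u)
        if structural_similarity(adj, u, v) >= eps
    ]
    return len(similar) >= mu


# Hypothetical toy graph: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}

if __name__ == "__main__":
    for u in sorted(adj):
        print(u, is_core(adj, u, eps=0.8, mu=3))
```

Computing every pairwise similarity at query time, as this sketch does, is the cost that an index storing core and similarity information is meant to avoid.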