Estimating the Clustering Coefficient Using Sample Complexity Analysis

Alane M. de Lima; Murilo V. G. da Silva; André Luís Vignatti

Estimating the Clustering Coefficient Using Sample Complexity Analysis

Alane M. de Lima, Murilo V. G. da Silva, André Luís Vignatti

Published: 01 Jan 2022, Last Modified: 05 Feb 2025LATIN 2022EveryoneRevisionsBibTeXCC BY-SA 4.0

Abstract: In this work we present a sampling algorithm for estimating the local clustering of each vertex of a graph. Let G be a graph with n vertices, m edges, and maximum degree \(\varDelta \). We present an algorithm that, given G and fixed constants \(0< \varepsilon , \delta , p < 1\), outputs the values for the local clustering coefficient within \(\varepsilon \) error with probability \(1 - \delta \), for every vertex v of G, provided that the (exact) local clustering of v is not “too small.” We use VC dimension theory to give a bound for the number of edges required to be sampled by the algorithm. We show that the algorithm runs in time \(\mathcal {O}(\varDelta \lg \varDelta + m)\). We also show that the running time drops to, possibly, sublinear time if we restrict G to belong to some well-known graph classes. In particular, for planar graphs the algorithm runs in time \(\mathcal {O}(\varDelta )\). In the case of bounded-degree graphs the running time is \(\mathcal {O}(1)\) if a bound for the value of \(\varDelta \) is given as a part of the input, and \(\mathcal {O}(n)\) otherwise.

Loading