propagate: A Seed Propagation Framework to Compute Distance-Based Metrics on Very Large Graphs

Giambattista Amati, Antonio Cruciani, Daniele Pasquini, Paola Vocca, Simone Angelini

Published: 2023, Last Modified: 24 Mar 2026. Venue: ECML/PKDD (3) 2023. License: CC BY-SA 4.0
Abstract: We propose propagate, a fast approximation framework that estimates distance-based metrics on very large graphs, such as the (effective) diameter or the average distance, within a small error. The framework assigns seeds to nodes and propagates them in a BFS-like fashion, computing the neighborhood sets until it obtains either the whole vertex set (for computing the diameter) or a given percentage of vertices (for the effective diameter). At each iteration, we derive compressed Boolean representations of the neighborhood sets discovered so far. The propagate framework yields two algorithms: propagate-p, which propagates all \(s\) seeds in parallel, and propagate-s, which propagates the seeds sequentially. For each node, the compressed representation requires \(s\) bits in propagate-p and only 1 bit in propagate-s. Both algorithms compute the average distance, the effective diameter, the diameter, and the connectivity rate (a measure of the sparseness degree of the transitive closure graph) within a small error with high probability: for any \(\varepsilon >0\) and using \(s=\varTheta \left( \frac{\log n}{\varepsilon ^2}\right) \) sample nodes, the error for the average distance is bounded by \(\xi = \frac{\varepsilon \varDelta }{\alpha }\); the errors for the effective diameter and the diameter are bounded by \(\xi = \frac{\varepsilon }{\alpha }\); and the error for the connectivity rate is bounded by \(\varepsilon \), where \(\varDelta \) is the diameter and \(\alpha \) is the connectivity rate. The time complexity of our approaches is \(\mathcal {O}(\varDelta \cdot m)\) for propagate-p and \(\mathcal {O}\left( \frac{\log n}{\varepsilon ^2}\cdot \varDelta \cdot m\right) \) for propagate-s, where \(m\) is the number of edges of the graph. The experimental results show that the propagate framework improves on the current state of the art in accuracy, speed, and space. Moreover, we experimentally show that propagate is also very efficient for solving the All Pair Shortest Path problem in very large graphs.
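The parallel seed propagation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes an undirected graph given as an adjacency dictionary, stores each node's \(s\)-bit signature as a plain Python integer (no compression), and fixes the effective-diameter percentage at 90% without interpolation. The names `propagate_p`, `adj`, `reached`, and `fraction` are hypothetical, and the way the metrics are derived from the per-hop counts (for example, excluding seed-to-itself pairs from the average distance) is an assumption rather than the paper's exact definition.

```python
import random


def propagate_p(adj, s, fraction=0.9, seed=0):
    """Illustrative parallel seed propagation (assumed, simplified variant).

    adj: dict mapping each node to an iterable of its neighbours (undirected).
    s: number of seed nodes to sample.
    """
    rng = random.Random(seed)
    nodes = list(adj)
    seeds = rng.sample(nodes, min(s, len(nodes)))

    # Bit i of sig[v] is set once seed i has reached node v.
    sig = {v: 0 for v in nodes}
    for i, v in enumerate(seeds):
        sig[v] |= 1 << i

    # reached[t] = number of (seed, node) pairs at distance <= t.
    reached = [len(seeds)]
    while True:
        # Synchronous hop: every node ORs in its neighbours' old signatures.
        new = dict(sig)
        for v in nodes:
            for u in adj[v]:
                new[v] |= sig[u]
        count = sum(bin(x).count("1") for x in new.values())
        if count == reached[-1]:
            break  # no new pair discovered, propagation has converged
        sig = new
        reached.append(count)

    total = reached[-1]
    non_self = total - len(seeds)  # pairs other than (seed, seed itself)
    # Pairs at distance exactly t number reached[t] - reached[t - 1].
    dist_sum = sum(t * (reached[t] - reached[t - 1])
                   for t in range(1, len(reached)))
    return {
        "diameter": len(reached) - 1,  # largest hop that found a new pair
        "avg_distance": dist_sum / non_self if non_self else 0.0,
        "effective_diameter": next(
            t for t, c in enumerate(reached) if c >= fraction * total),
        "connectivity_rate": total / (len(seeds) * len(nodes)),
    }


# Path graph 0-1-2-3-4; sampling all five nodes as seeds makes the result exact.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
result = propagate_p(path, s=5)
```

With fewer seeds than nodes the returned values are estimates whose quality depends on the sample. The number of hops in particular only reflects the largest eccentricity among the sampled seeds, so it can underestimate the true diameter.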