Efficient Computation for Diagonal of Forest Matrix via Variance-Reduced Forest Sampling

Published: 23 Jan 2024, Last Modified: 23 May 2024TheWebConf24 OralEveryoneRevisionsBibTeX
Keywords: Forest matrix, Wilson's algorithm, spanning converging forest, variance reduction
TL;DR: We study the problem of efficient computation for diagonal of the forest matrix.
Abstract: The forest matrix, particularly its diagonal elements, has far-reaching implications in network science and machine learning. The state-of-the-art algorithms for the diagonal of forest matrix computation are based on a fast Laplacian solver. However, these algorithms encounter limitations when applied to digraphs due to the incapacity of the Laplacian solver. To overcome the issue, in this paper, we propose three novel sampling-based algorithms: SCF, SCFV, and SCFV. Our first algorithm SCF leverages a probability interpretation of the diagonal of the forest matrix and utilizes an expansion of Wilson's algorithm to sample spanning converging forests. To reduce the variance in the forest sampling, we develop two novel variance-reduced techniques. The first technique, leading to the proposal of the SCFV algorithm, is inspired by opinion dynamics in graphs and applies matrix-vector iteration to the spanning forest sampling. While SCFV achieves reduced variance compared to SCF, the cross-product term in its variance expression can be complex and potentially large in certain graphs. Therefore, we develop another technique, leading to a new iteration equation and the SCFV+ algorithm. SCFV+ achieves further reduced variance without the cross-product term in the variance of SCFV. We prove that SCFV+ can achieve a relative error guarantee with high probability and maintain a linear time complexity relative to the nodes of graphs, presenting a superior theoretical result compared to state-of-the-art algorithms. Finally, we conduct extensive experiments on various real-world networks, showing that our algorithms achieve better estimation accuracy and are more time-efficient than the state-of-the-art algorithms. Moreover, our algorithms are scalable to massive graphs with more than twenty million nodes in both undirected and directed graphs.
Track: Graph Algorithms and Learning for the Web
Submission Guidelines Scope: Yes
Submission Guidelines Blind: Yes
Submission Guidelines Format: Yes
Submission Guidelines Limit: Yes
Submission Guidelines Authorship: Yes
Student Author: Yes
Submission Number: 1568
Loading