On Efficient Single-Source Personalized PageRank Computation in Online Social Networks

Victor Junqiu Wei, Di Jiang, Chen Zhang

Published: 01 Jan 2025, Last Modified: 25 Aug 2025IEEE Trans. Knowl. Data Eng. 2025EveryoneRevisionsBibTeXCC BY-SA 4.0

Abstract: The Single-Source Personalized PageRank (SSPPR) problem is widely used in information retrieval and recommendation systems. Traditional algorithms assume full knowledge of the network, making them inapplicable to online social networks (OSNs), where the topology is unknown, and users can only explore the network step by step via APIs. The only feasible approach for SSPPR in OSNs is Monte Carlo (MC) simulation, but traditional MC methods rely on static sampling, which lacks flexibility, delays feedback, and overestimates the number of required random walks. To address these limitations, we propose PANDA (Single-Source Personalized PageRank on OSNs with Rademacher Average), a progressive sampling algorithm. PANDA iteratively samples random walks in batches, estimating accuracy dynamically using Rademacher Average from statistical learning theory. This data-dependent approach allows for early termination once the desired accuracy is met. Additionally, PANDA features a dynamic sampling schedule to optimize efficiency. Empirical studies show that PANDA significantly outperforms existing methods, achieving the same accuracy with far greater efficiency.