Time- and Space-Efficiently Sketching Billion-Scale Attributed Networks

Wei Wu, Shiqi Li, Mi Jiang, Chuan Luo, Fangfang Li

Published: 2025, Last Modified: 22 Jan 2026IEEE Trans. Knowl. Data Eng. 2025EveryoneRevisionsBibTeXCC BY-SA 4.0

Abstract: Attributed network embedding seeks to depict each network node via a compact, low-dimensional vector while effectively preserving the similarity between node pairs, which lays a strong foundation for a great many high-level network mining tasks. With the advent of the era of Big Data, the number of nodes and edges has reached billions in many real-world networks, which poses great computational and storage challenges to the existing methods. Although some algorithms have been developed to handle billion-scale networks, they often undergo accuracy degradation or tempo-spatial inefficiency owing to attribute information loss or substantial parameter learning. To this end, we propose a simple, time- and space-efficient billion-scale attributed network embedding algorithm called SketchBANE in this paper, which strikes an excellent balance between accuracy and efficiency by adopting sparse random projection with 1-bit quantization to sketch the iterative closed neighborhood and maintain the similarity among high-order nodes in a non-learning manner. The extensive experimental results indicate that our proposed SketchBANE algorithm competes favorably with the state-of-the-art approaches, while remarkably reducing runtime and space consumption. Also, the proposed SketchBANE algorithm exhibits good scalability and parallelization.

External IDs:dblp:journals/tkde/WuLJLL25