Abstract: Representation learning via network embedding has received tremendous attention due to its efficacy in solving downstream tasks. Popular embedding methods (such as <monospace>deepwalk</monospace>, <monospace>node2vec</monospace>, <monospace>LINE</monospace>) are based on neural architectures and are therefore unable to scale to large networks in terms of both time and space usage. Recently, we proposed <inline-formula><tex-math notation="LaTeX">$\mathrm{BinSketch}$</tex-math></inline-formula>, a sketching technique for compressing binary vectors to binary vectors. In this paper, we show how to extend <inline-formula><tex-math notation="LaTeX">$\mathrm{BinSketch}$</tex-math></inline-formula> and use it for network hashing. Our proposal, named <monospace>QUINT</monospace>, is built upon <inline-formula><tex-math notation="LaTeX">$\mathrm{BinSketch}$</tex-math></inline-formula>, and it embeds the nodes of a sparse network into a low-dimensional space using simple bit-wise operations. <monospace>QUINT</monospace> is the first of its kind to provide tremendous gains in speed and space usage without compromising much on the accuracy of the downstream tasks. Extensive experiments compare <monospace>QUINT</monospace> with seven state-of-the-art network embedding methods on two end tasks – link prediction and node classification. We observe huge performance gains for <monospace>QUINT</monospace> in terms of speedup (up to <inline-formula><tex-math notation="LaTeX">$7000\times$</tex-math></inline-formula>) and space saving (up to <inline-formula><tex-math notation="LaTeX">$800\times$</tex-math></inline-formula>), owing to the bit-wise nature of its node embeddings. Moreover, <monospace>QUINT</monospace> is a consistent top performer on both tasks among the baselines across all datasets. Our empirical observations are backed by rigorous theoretical analysis justifying the effectiveness of <monospace>QUINT</monospace>.
In particular, we prove that <monospace>QUINT</monospace> retains enough structural information to approximate many topological properties of networks with high confidence.
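To illustrate the flavor of bit-wise sketching described above, here is a minimal sketch of a BinSketch-style compression of binary adjacency rows. This is an assumption-laden toy, not the paper's actual QUINT construction: it assumes each of the $d$ input coordinates is mapped uniformly at random to one of $k$ sketch bits, and each sketch bit is the OR of the input bits assigned to it. The function names (`make_sketcher`, `sketch`) and the bucket-mapping details are hypothetical.

```python
import numpy as np

def make_sketcher(d, k, seed=0):
    """Build a random coordinate->bucket map and return a sketch function.

    Hypothetical BinSketch-style compressor: each of the d input
    coordinates is assigned to one of k buckets; a sketch bit is the
    bit-wise OR of all input bits landing in its bucket.
    """
    rng = np.random.default_rng(seed)
    bucket = rng.integers(0, k, size=d)  # random coordinate -> bucket map

    def sketch(x):
        # x: binary vector of length d (e.g. a node's adjacency row)
        s = np.zeros(k, dtype=np.uint8)
        np.bitwise_or.at(s, bucket, x.astype(np.uint8))  # OR-aggregate
        return s

    return sketch

# Toy example: embed the adjacency rows of a 6-node graph into 4 bits each.
A = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1],
    [0, 0, 1, 0, 0, 1],
    [0, 0, 0, 1, 1, 0],
], dtype=np.uint8)
sketch = make_sketcher(d=6, k=4)
E = np.vstack([sketch(row) for row in A])  # 6 x 4 binary node embedding
```

Because the sketch uses only OR aggregation over a fixed random map, it needs no training and runs in time linear in the number of edges, which is the kind of behavior that makes bit-wise embeddings fast and compact on sparse networks.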