Abstract: PathSim is a widely used meta-path-based similarity measure in heterogeneous information networks (HINs). Numerous applications, including similarity search and clustering, rely on computing PathSim. Computing PathSim scores on large HINs is challenging because of the high time and storage complexity involved. In this paper, we recast the computation of the ground-truth PathSim as a learning problem. We design NeuPath, an encoder-decoder framework that takes the algorithmic structure of PathSim into account. NeuPath has two variants: (1) The static variant learns PathSim scores between two nodes in a static heterogeneous graph. Specifically, the encoder module identifies the top-T optimized path instances, which can approximate the ground-truth PathSim. (2) The temporal variant predicts the PathSim score for a pair of nodes at a given timestamp. Here, the encoder weighs neighbors subject to a time encoding attached to every edge. In both variants, the decoder maps each embedding vector to a scalar, which gives the similarity score. We perform extensive experiments on real-world datasets from different domains. Our results show that NeuPath outperforms state-of-the-art baselines on both the PathSim approximation task and the similarity search task, in static as well as temporal HINs.
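
For readers unfamiliar with the quantity being approximated, the sketch below computes exact PathSim under its standard definition: for a symmetric meta-path, the score of a pair (x, y) is twice the number of path instances from x to y, divided by the number of path instances from x to itself plus the number from y to itself. This is not NeuPath; it is only the ground truth that the abstract refers to. The function name `pathsim_scores` and the input `half` (counts along the first half of the meta-path) are hypothetical names chosen for illustration.

```python
def pathsim_scores(half):
    """Exact PathSim for every pair of objects under a symmetric meta-path.

    half[i][k] is the number of path instances from object i to
    middle-type object k along the first half of the meta-path
    (illustrative input format, not taken from the paper).
    """
    n = len(half)
    # Commuting matrix M = half * half^T: M[i][j] counts path instances i -> j.
    m = [[sum(a * b for a, b in zip(half[i], half[j])) for j in range(n)]
         for i in range(n)]
    scores = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            denom = m[i][i] + m[j][j]
            if denom > 0:  # objects with no path instances keep score 0
                scores[i][j] = 2.0 * m[i][j] / denom
    return scores


# Toy example: three objects, two middle-type objects.
half = [[2, 1], [4, 2], [0, 3]]
scores = pathsim_scores(half)
print(scores[0][1])  # 2 * 10 / (5 + 20) = 0.8
```

The dense n-by-n commuting matrix built here is what makes the exact computation expensive in time and storage on large HINs, which is the cost the learning-based formulation aims to avoid.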