Multi-Constrained Graph Pattern Matching in large-scale contextual social graphs

Published: 2015, Last Modified: 15 May 2023, ICDE 2015, Readers: Everyone
Abstract: Graph Pattern Matching (GPM) plays a significant role in social network analysis and has been widely used in, for example, expert finding, social community mining, and social position detection. Given a pattern graph G_Q and a data graph G_D, a GPM algorithm finds those subgraphs G_M in G_D that match G_Q. However, existing GPM methods do not consider multiple constraints on the edges of G_Q, which commonly exist in applications such as crowdsourcing travel, social-network-based e-commerce, and study group selection. In this paper, we first conceptually extend Bounded Simulation to Multi-Constrained Simulation (MCS) and propose a novel NP-Complete Multi-Constrained Graph Pattern Matching (MC-GPM) problem. Then, to address the efficiency issue in large-scale MC-GPM, we propose a new concept called Strong Social Component (SSC), consisting of participants with strong social connections. We also propose an approach to identify SSCs, together with a novel index method and a graph compression method for SSCs. Moreover, we devise a heuristic algorithm that identifies MC-GPM results effectively and efficiently without decompressing graphs. An extensive empirical study on five real-world large-scale social graphs demonstrates the effectiveness, efficiency, and scalability of our approach.
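To make the idea of a simulation with several edge constraints concrete, here is a minimal Python sketch. It is not the algorithm from the paper: it uses no SSCs, index, or compression, and it searches paths by brute force, which is exponential in the worst case. The data layout, the function names (`reachable`, `multi_constrained_simulation`), the two edge attributes, and the rule that attribute values in [0, 1] are multiplied along a path are all assumptions chosen for illustration. A pattern edge is taken to mean "within `max_hops` edges, with every aggregated attribute at or above its threshold".

```python
def reachable(graph, src, max_hops, mins):
    """Nodes reachable from src by a simple path of 1..max_hops edges whose
    per-attribute products all stay >= the thresholds in mins (assumed rule)."""
    found = set()

    def dfs(node, hops, vals, visited):
        if hops == max_hops:
            return
        for nxt, attrs in graph.get(node, {}).items():
            if nxt in visited:
                continue
            new_vals = tuple(a * b for a, b in zip(vals, attrs))
            # Attributes lie in [0, 1], so products only shrink: safe to prune.
            if any(x < m for x, m in zip(new_vals, mins)):
                continue
            found.add(nxt)
            dfs(nxt, hops + 1, new_vals, visited | {nxt})

    dfs(src, 0, (1.0,) * len(mins), {src})
    return found


def multi_constrained_simulation(pattern_nodes, pattern_edges, data_labels, data_graph):
    """Fixpoint refinement: keep data node v as a match of pattern node u only
    if, for each pattern edge (u, u2), some match of u2 is reachable from v
    under that edge's hop bound and attribute thresholds."""
    sim = {
        u: {v for v, lab in data_labels.items() if lab == label}
        for u, label in pattern_nodes.items()
    }
    changed = True
    while changed:
        changed = False
        for (u, u2), (max_hops, mins) in pattern_edges.items():
            for v in list(sim[u]):
                if not (reachable(data_graph, v, max_hops, mins) & sim[u2]):
                    sim[u].discard(v)
                    changed = True
    # If any pattern node has no match, the whole pattern has no match.
    return sim if all(sim.values()) else {}


# Hypothetical toy data: edge attributes are (trust, intimacy) in [0, 1].
data_labels = {"a": "PM", "d": "PM", "b": "SE", "c": "SE", "x": "other"}
data_graph = {
    "a": {"x": (0.9, 0.8)},
    "x": {"b": (0.9, 0.9)},
    "d": {"c": (0.3, 0.9)},
}
pattern_nodes = {"pm": "PM", "se": "SE"}
# A PM must reach an SE within 2 hops, with trust >= 0.5 and intimacy >= 0.5.
pattern_edges = {("pm", "se"): (2, (0.5, 0.5))}

matches = multi_constrained_simulation(pattern_nodes, pattern_edges, data_labels, data_graph)
print(matches)
```

In this toy graph, `a` reaches `b` in two hops with aggregated values of roughly (0.81, 0.72), so it stays a match for `pm`, while `d` is dropped because its only edge has trust 0.3. Node `c` remains a match for `se` because, as in ordinary graph simulation, only outgoing pattern edges are checked.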