Abstract: To infer the underlying diffusion network, most existing approaches start from an initial set of potential edges constructed from the observed data (i.e., the infection times of nodes) and then infer the diffusion edges from that set. However, relatively few studies combine the infection times and infection statuses of nodes to preprocess the edge set so as to improve the accuracy and efficiency of network inference. To bridge this gap, this paper proposes a two-stage inference algorithm, Clustering-based Network Inference with Submodular Maximization (CNISM). In the first stage, using a well-designed metric that fuses the infection times and infection statuses of nodes, we first quickly infer effective candidate edges from the initial candidate edge set by clustering, and then capture the cluster structures of nodes from these effective candidate edges, which aids the subsequent inference. In the second stage, the cluster structures of nodes are integrated into MulTree, a submodular maximization algorithm based on multiple trees, to infer the topology of the diffusion network. Experimental results on both synthetic and real-world networks show that our framework generally achieves higher inference accuracy than the compared algorithms at a low computational cost.