Abstract: In the data mining field, community detection, which decomposes a graph into multiple subgraphs, is one of the major techniques to analyze graph data. In recent years, the scalability of the community detection algorithm has been a crucial issue because of the growing size of real-world networks such as the co-author network and web graph. In this paper, we propose a scalable overlapping community detection method by using the stochastic variational Bayesian training of latent Dirichlet allocation (LDA) models, which predicts sets of neighbor nodes with a community mixture distribution. In the experiment, we show that the proposed method is much faster than previous methods and is capable of detecting communities even in a huge network that contains 60 million nodes and 1.8 billion edges. Furthermore, we compared different mini-batch sizes and the number of iterations in stochastic variational Bayesian inference to determine an empirical trade-off between efficiency and quality of overlapping community detection.
0 Replies
Loading