Abstract: Community detection is the problem of finding naturally forming clusters in networks. It is an important problem in mining and analyzing social and other complex networks. Community detection can be used to analyze complex systems in the real world and has applications in many areas, including network science, data mining, and computational biology. Label propagation is a community detection method that is simpler and faster than other methods such as Louvain, InfoMap, and spectral-based approaches. Some real-world networks can be very large and have billions of nodes and edges. Sequential algorithms might not be suitable for dealing with such large networks. This paper presents distributed-memory and hybrid parallel community detection algorithms based on the label propagation method. We incorporated novel optimizations and communication schemes, leading to very efficient and scalable algorithms. We also discuss various load-balancing schemes and present their comparative performances. These algorithms have been implemented and evaluated using large high-performance computing systems. Our hybrid algorithm is scalable to thousands of processors and has the capability to process massive networks. This algorithm was able to detect communities in the Metaclust50 network, a massive network with 282 million nodes and 42 billion edges, in 654 s using 4096 processors.
External IDs:dblp:conf/asunam/BodduK24
Loading