Abstract: Huge amounts of data, known as "big data," are being generated in many fields. A network is a popular data structure for representing and analyzing big data. However, conventional network analysis algorithms do not scale to the size of big data. To address this limitation, we propose a network clustering algorithm for big data networks based on a parallel distributed computation model. To accommodate parallel computation, we change the paradigm of the conventional clustering algorithm using triangle structures. We demonstrate that the proposed algorithm can process big data networks that a conventional algorithm cannot handle. Experimental results show that the proposed algorithm is faster than the conventional algorithm.