- Keywords: node embedding, community detection, biased random walks
- TL;DR: A community-preserving node embedding algorithm that enables more effective community detection by clustering in the embedded space
- Abstract: Detecting the communities, or modular structure, of real-life networks (e.g., a social network or a product purchase network) is an important task because the way a network functions is often determined by its communities. Traditional approaches to community detection are modularity-based: generally speaking, they construct partitions using heuristics that seek to maximize the ratio of the edges within the partitions to those between them. Node embedding approaches, which represent each node in a graph as a real-valued vector, transform the problem of community detection in a graph into that of clustering a set of vectors. Existing node embedding approaches primarily initiate uniform random walks from each node to construct a context for that node and then seek to make the vector representation of the node close to its context. However, standard node embedding approaches do not directly take into account the community structure of a network while constructing the context around each node. To address this limitation, we explore two different threads of work. First, we investigate the use of biased random walks (specifically, maximum-entropy-based walks) to obtain more centrality-preserving node embeddings, which we hypothesize may lead to more effective clusters in the embedded space. Second, we propose a community-structure-aware node embedding approach in which we incorporate modularity-based partitioning heuristics into the objective function of node embedding. We demonstrate that our proposed approach for community detection outperforms a number of modularity-based baselines, as well as K-means on a standard node-embedded vector space (specifically, node2vec), on a wide range of real-life networks of different sizes and densities.
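
To make the contrast between uniform and maximum-entropy-based walks concrete, here is a minimal sketch. The abstract does not give the paper's exact formulation, so the code assumes the standard maximal-entropy random walk definition, P_ij = A_ij ψ_j / (λ ψ_i), where λ and ψ are the largest eigenvalue and corresponding eigenvector of the adjacency matrix A. The function names and the toy graph are hypothetical and chosen only for illustration.

```python
import numpy as np


def uniform_transition_matrix(adj):
    """Uniform random walk: each neighbor is chosen with probability 1/degree."""
    adj = np.asarray(adj, dtype=float)
    return adj / adj.sum(axis=1, keepdims=True)


def merw_transition_matrix(adj):
    """Maximal-entropy random walk for a connected, undirected graph.

    Assumed (textbook) definition: P_ij = A_ij * psi_j / (lam * psi_i),
    where lam is the largest eigenvalue of A and psi its eigenvector.
    """
    adj = np.asarray(adj, dtype=float)
    # eigh applies because A is symmetric; eigenvalues come back in ascending order.
    eigvals, eigvecs = np.linalg.eigh(adj)
    lam = eigvals[-1]
    # For a connected graph the leading eigenvector has entries of one sign;
    # taking the absolute value fixes the arbitrary sign returned by the solver.
    psi = np.abs(eigvecs[:, -1])
    return adj * psi[np.newaxis, :] / (lam * psi[:, np.newaxis])


# Hypothetical toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
A = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
])

P_uniform = uniform_transition_matrix(A)
P_merw = merw_transition_matrix(A)
```

Both matrices are row-stochastic and place probability only on existing edges. The difference is that the maximum-entropy walk weights each neighbor by its eigenvector centrality rather than treating all neighbors equally, which is the sense in which the resulting contexts may be more centrality-preserving.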