Finding Community Topics and Membership in Graphs

Matt Revelle, Carlotta Domeniconi, Mackenzie Sweeney, Aditya Johri

Published: 10 Sept 2015, Last Modified: 15 Apr 2025 | European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases | Everyone | CC BY 4.0

Abstract: Community detection in networks is a broad problem with many proposed solutions. Existing methods frequently make use of edge density and node attributes; however, they ultimately differ in how they define a community, and they build strong assumptions about community features into their models. We propose a new method for community detection that estimates both per-community feature distributions (topics) and per-node community membership. Communities are modeled as connected subgraphs whose nodes share similar attributes. Nodes may join multiple communities and share common attributes with each. Each community has an associated probability distribution over attributes, and node attributes are modeled as draws from a mixture distribution. We make two basic assumptions about community structure: communities are densely connected and have a small network diameter. These assumptions inform the estimation of community topics and membership assignments without being too prescriptive. We present competitive results against state-of-the-art methods for finding communities in networks constructed from NSF awards, the DBLP repository, and the Scratch online community.
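The mixture view of node attributes described in the abstract can be illustrated with a minimal sketch. This is not the authors' model or estimation procedure. It only shows, under stated assumptions, what it means for a node's attributes to be draws from a mixture of per-community attribute distributions. The function names, the toy communities ("ml", "networks"), the attribute vocabulary, and all probabilities are hypothetical.

```python
import random


def mixture_probability(attribute, membership, topics):
    """Probability of one attribute under a node's mixture of community topics.

    membership: dict mapping community -> mixture weight (assumed to sum to 1)
    topics:     dict mapping community -> {attribute: probability}
    """
    return sum(
        weight * topics[community].get(attribute, 0.0)
        for community, weight in membership.items()
    )


def sample_node_attributes(membership, topics, n_draws, rng):
    """Draw attributes for one node: pick a community, then an attribute from its topic."""
    communities = list(membership)
    weights = [membership[c] for c in communities]
    draws = []
    for _ in range(n_draws):
        # Step 1: choose which community generates this attribute.
        community = rng.choices(communities, weights=weights, k=1)[0]
        # Step 2: draw the attribute from that community's topic distribution.
        attributes = list(topics[community])
        probs = [topics[community][a] for a in attributes]
        draws.append(rng.choices(attributes, weights=probs, k=1)[0])
    return draws


# Hypothetical toy data: two communities with overlapping attribute vocabularies.
TOPICS = {
    "ml": {"learning": 0.7, "graph": 0.3},
    "networks": {"graph": 0.8, "routing": 0.2},
}
# A node that belongs equally to both communities (illustrative weights).
MEMBERSHIP = {"ml": 0.5, "networks": 0.5}

if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_node_attributes(MEMBERSHIP, TOPICS, 5, rng))
    print(mixture_probability("graph", MEMBERSHIP, TOPICS))
```

In this toy setup, an attribute shared by both communities ("graph") receives probability mass from each, which is one way to picture a node that joins multiple communities and shares attributes with each of them.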