Abstract: We propose a method to characterize user behavior from engagement with enterprise social media. Because content analysis is often hampered by noise, we instead study behavior through temporal activity, i.e., the number of posts per month represented as a time series. User posting volume on social media is long-tailed, which causes time series clustering algorithms to produce unbalanced clusters containing either very few users or almost all users. We therefore propose a hierarchical time series clustering algorithm that groups users by behavioral homogeneity and provides interpretable characterizations of the resulting clusters. Users in distinct clusters differ significantly in their topics of interest, while users within a cluster are homophilic (near-identical or like-minded). The quality of the clustering is demonstrated on Enterprise Social Media (ESM), Stack Exchange, and Linux Kernel Mailing List (LKML) datasets in comparison with existing clustering techniques.
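The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes posts-per-month vectors as input, Euclidean distance, and a simple deterministic 2-means split that is re-applied to any cluster larger than a hypothetical `max_size` threshold. All function names, the threshold, and the toy data are assumptions made for illustration.

```python
from collections import Counter


def monthly_series(post_months, n_months):
    """Turn a list of 0-based month indices into a posts-per-month vector."""
    counts = Counter(post_months)
    return [counts.get(m, 0) for m in range(n_months)]


def _dist(a, b):
    # Euclidean distance between two equal-length series (an assumption).
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def _mean(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def two_means(series, ids, iters=20):
    """Split ids into two groups with a deterministic 2-means.

    Centers are seeded with the lowest- and highest-volume users.
    """
    ordered = sorted(ids, key=lambda u: sum(series[u]))
    centers = [series[ordered[0]], series[ordered[-1]]]
    groups = ([], [])
    for _ in range(iters):
        groups = ([], [])
        for u in ids:
            d0 = _dist(series[u], centers[0])
            d1 = _dist(series[u], centers[1])
            groups[0 if d0 <= d1 else 1].append(u)
        if not groups[0] or not groups[1]:
            break  # degenerate split, e.g. all series identical
        new_centers = [_mean([series[u] for u in g]) for g in groups]
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return groups


def hierarchical_split(series, ids, max_size):
    """Recursively re-split any cluster that holds more than max_size users."""
    if len(ids) <= max_size:
        return [sorted(ids)]
    left, right = two_means(series, ids)
    if not left or not right:
        return [sorted(ids)]  # cannot be split further
    return (hierarchical_split(series, left, max_size)
            + hierarchical_split(series, right, max_size))


# Toy long-tailed data: one heavy poster, one moderate, four light (made up).
series = {
    "u1": [50, 60, 55],
    "u2": [1, 0, 2],
    "u3": [0, 1, 1],
    "u4": [2, 1, 0],
    "u5": [0, 0, 1],
    "u6": [5, 6, 4],
}
flat = two_means(series, list(series))
clusters = hierarchical_split(series, list(series), max_size=3)
```

On this toy data a single flat split isolates the heavy poster and leaves everyone else in one cluster, which is the imbalance described above. The recursive version keeps subdividing the oversized cluster until no cluster exceeds `max_size`.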