Track: Social networks and social media
Keywords: information diffusion, hashtags, social networks, social identity, Twitter
TL;DR: We investigate how the topology of Twitter's social network and the identity of users shape the adoption of hashtags, and we examine heterogeneity in when hashtag cascades are best modeled by network and identity together versus by network or identity alone.
Abstract: The diffusion of culture online (e.g., hashtags) is theorized to be influenced by many interacting social factors (e.g., network _and_ identity). However, most existing computational cascade models capture just a single factor (e.g., network _or_ identity). This work offers a new framework for teasing apart the mechanisms underlying hashtag cascades. We curate a new dataset of 1,337 hashtags representing cultural innovation online, develop a 10-factor evaluation framework for comparing empirical and synthetic cascades, and show that a combined network+identity model performs better than a network-only or identity-only counterfactual. We also explore the heterogeneity in this result: While a combined network+identity model best predicts the popularity of cascades, a network-only model better predicts cascade growth and an identity-only model better predicts adopter composition. The network+identity model most strongly outperforms the counterfactuals among hashtags used for expressing racial or regional identity and for talking about sports or news. In fact, we are able to predict which combination of network and/or identity best models each hashtag and use this prediction to further improve performance. In sum, our results point to the utility of multi-factor models in predicting cascades, in order to account for the varied ways in which network, identity, and other social factors play a role in the diffusion of hashtags on Twitter.
Submission Number: 2225