Rhea: Adaptively sampling authoritative content from social activity streams

Published: 2017, Last Modified: 19 Jan 2026, IEEE BigData 2017, CC BY-SA 4.0
Abstract: Processing the full activity stream of a social network in real time is often prohibitive in both storage and computational cost. One way around this problem is to sample the social activity and feed the sample into applications such as content recommendation, opinion mining, or sentiment analysis. In this paper, we study the problem of extracting samples of authoritative content from a social activity stream. Specifically, we propose an adaptive stream sampling approach, termed Rhea, that processes a stream of social activity in real time and samples the content of users who are more likely to provide influential information. To the best of our knowledge, Rhea is the first algorithm that dynamically adapts over time to account for evolving trends in the activity stream. It can therefore capture high-quality content from emerging users that contemporary whitelist-based methods ignore. We evaluate Rhea using data from two popular social networks, reaching up to half a billion posts. Our results show that Rhea significantly outperforms previously proposed methods in both recall and precision, while also producing considerably more accurate rankings.