Predicting Prevalence of Influenza-Like Illness From Geo-Tagged Tweets

2017 (modified: 12 Nov 2022) | WWW (Companion Volume) 2017 | Readers: Everyone
Abstract: Modeling disease spread and distribution using social media data has become an increasingly popular research area. While Twitter data has recently been investigated for estimating disease spread, the extent to which it represents disease spread and distribution at the macro scale is still an open question. In this paper, we focus on macro-scale modeling of influenza-like illnesses (ILI) using a large dataset of 8,961,932 tweets from Australia collected in 2015. We first propose modifications to state-of-the-art ILI-related tweet detection approaches to obtain a more refined dataset. We then normalize the number of detected ILI-related tweets by the Internet access and Twitter penetration rates in each state. Finally, we fit a state-level linear regression model relating the number of ILI-related tweets to the number of real influenza notifications. The Pearson correlation coefficient of the model is 0.93. Our results indicate that: 1) a strong positive linear correlation exists between the number of ILI-related tweets and the number of recorded influenza notifications at the state scale; 2) Twitter data shows promise for helping detect influenza outbreaks; 3) taking into account the population, Internet access, and Twitter penetration rates in each state improves the prevalence modeling analysis.
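The normalization and regression steps described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact normalization formula, so the code below assumes one plausible choice: dividing each state's ILI tweet count by an estimated Twitter user base (population × Internet access rate × Twitter penetration rate). The function names, state labels, and all numbers are hypothetical placeholders, not data or code from the paper.

```python
from math import sqrt


def normalize_counts(tweet_counts, population, internet_rate, twitter_rate):
    """Scale raw ILI tweet counts by an estimated Twitter user base per state.

    ASSUMPTION: estimated users = population * internet rate * Twitter rate.
    The paper's actual normalization may differ.
    """
    normalized = {}
    for state, count in tweet_counts.items():
        est_users = population[state] * internet_rate[state] * twitter_rate[state]
        normalized[state] = count / est_users
    return normalized


def _centered_sums(xs, ys):
    """Return mean_x, mean_y, Sxy, Sxx, Syy for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return mean_x, mean_y, sxy, sxx, syy


def pearson_r(xs, ys):
    """Pearson correlation coefficient between xs and ys."""
    _, _, sxy, sxx, syy = _centered_sums(xs, ys)
    return sxy / sqrt(sxx * syy)


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    mean_x, mean_y, sxy, sxx, _ = _centered_sums(xs, ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Hypothetical toy inputs for four made-up states (NOT data from the paper).
    tweets = {"A": 120, "B": 300, "C": 45, "D": 210}
    population = {"A": 1_000_000, "B": 2_500_000, "C": 500_000, "D": 1_800_000}
    internet = {"A": 0.85, "B": 0.88, "C": 0.80, "D": 0.86}
    twitter = {"A": 0.10, "B": 0.12, "C": 0.08, "D": 0.11}
    notifications = {"A": 900, "B": 2600, "C": 350, "D": 1700}

    rates = normalize_counts(tweets, population, internet, twitter)
    states = sorted(rates)
    xs = [rates[s] for s in states]
    # Notifications expressed per capita so both axes are rates (an assumption).
    ys = [notifications[s] / population[s] for s in states]

    slope, intercept = fit_line(xs, ys)
    print("Pearson r:", pearson_r(xs, ys))
    print("slope:", slope, "intercept:", intercept)
```

With real inputs, `pearson_r` would be the quantity the abstract reports as 0.93. Comparing the coefficient computed on raw counts with the one computed on normalized rates is one simple way to check whether normalization improves the fit.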