A Hybrid Outbreak Detection using Ontology-based Data Collection from Social Media

Published: 2023, Last Modified: 06 Mar 2025. Venue: BIBM 2023. License: CC BY-SA 4.0
Abstract: Rapidly spreading diseases pose a significant threat, leading to substantial loss of life and economic devastation, as seen in the global COVID-19 outbreaks. Developing disease prediction models is crucial for preemptive pandemic control and for minimizing the impact of outbreaks. As internet accessibility grows through computers and mobile devices, social media platforms provide a direct conduit to disseminate vital health information to the public. Unlike traditional methods that rely on bureaucratic channels, these platforms offer accurate and timely information distribution. We propose a framework that employs an ontology to identify disease symptoms and gather relevant tweets. An XGBoost-BiLSTM hybrid model then uses this data to predict the number of infected cases. The hybrid model capitalizes on XGBoost’s strength in handling limited dataset sizes, a prevalent challenge during outbreaks with insufficient time series data. Moreover, it enriches the data for the BiLSTM, improving its efficacy in predicting and monitoring outbreaks. To construct our dataset, we extracted tweets discussing symptoms from six distinct infectious disease outbreaks (Ebola, Zika, MERS, H1N1, Chikungunya, COVID-19) spanning 2012 to 2021. Our results demonstrate that the proposed hybrid model outperforms nine cutting-edge and baseline models. This advancement can significantly assist health authorities in minimizing fatalities and preparing preemptively for potential outbreaks.
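As a rough illustration of the data-collection step, the sketch below shows one way ontology terms could be used to select symptom-related tweets and aggregate them into a daily count series. It is a minimal sketch under stated assumptions, not the paper's implementation: the `ONTOLOGY` dictionary, its terms, and the function names are hypothetical, and plain keyword matching stands in for whatever ontology tooling the authors used.

```python
import re
from collections import Counter

# Hypothetical toy ontology (an assumption for illustration only):
# disease -> symptom concept -> surface terms that may appear in tweets.
ONTOLOGY = {
    "COVID-19": {
        "fever": ["fever", "high temperature"],
        "cough": ["cough", "dry cough"],
    },
    "Zika": {
        "rash": ["rash", "skin rash"],
        "fever": ["fever"],
    },
}


def symptom_patterns(disease):
    """Compile one case-insensitive, whole-word regex per symptom concept."""
    patterns = {}
    for concept, terms in ONTOLOGY[disease].items():
        # Longest terms first so multi-word synonyms are tried before substrings.
        ordered = sorted(terms, key=len, reverse=True)
        alternation = "|".join(re.escape(term) for term in ordered)
        patterns[concept] = re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)
    return patterns


def match_symptoms(text, patterns):
    """Return the set of symptom concepts mentioned in the text."""
    return {concept for concept, pattern in patterns.items() if pattern.search(text)}


def daily_symptom_counts(tweets, disease):
    """Count, per day, the tweets that mention at least one symptom.

    `tweets` is an iterable of (date_string, text) pairs.
    """
    patterns = symptom_patterns(disease)
    counts = Counter()
    for day, text in tweets:
        if match_symptoms(text, patterns):
            counts[day] += 1
    return counts
```

A daily series produced this way is the kind of input a downstream forecasting model could consume. Whole-word matching is a deliberately strict choice here: it misses inflected forms such as "coughing", which a real pipeline would likely handle with stemming or a richer synonym list.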