Hierarchical Clustering and Multivariate Forecasting for Health Econometrics

Published: 03 Jul 2023, Last Modified: 27 Jul 2023, Venue: KDD 2023 Workshop epiDAMIK, Readers: Everyone
Keywords: Clustering, forecasting, health econometrics, data science
TL;DR: This study applies machine learning methods to forecast the long-term impact of socio-economic changes on health indicators, using Hierarchical Cluster Analysis to group countries and a multivariate Prophet model for time series analysis.
Abstract: Data science approaches remain limited in Health Econometrics and Public Health research, where state-of-the-art computational methods are largely unexplored. Recent studies have shown that neural networks and machine learning methods outperform traditional statistical methods in forecasting and time-series analysis. In this study, we demonstrate the use of unsupervised and supervised machine learning approaches to create "what-if" scenarios for forecasting the long-term impact of changes in socio-economic indicators on health indicators. These indicators include basic sanitation services, immunization, population ages, life expectancy, and domestic health expenditure. First, we used Hierarchical Cluster Analysis to group 131 countries into 9 clusters based on various indicators from the World Bank Health Statistics and Nutrition dataset. To showcase the feasibility of our approach, we then performed a time series analysis using a multivariate Prophet model on the most significant features from a cluster consisting of Bahrain, Kuwait, Oman, Qatar, and Saudi Arabia. The study developed robust models (R² of 0.93 or higher) capable of forecasting 11 health indicators up to 10 years into the future. By employing these "what-if" scenarios and forecasting models, policymakers and healthcare practitioners can make informed decisions and effectively implement targeted interventions to address health-related challenges.
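The clustering step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the linkage criterion, distance metric, or preprocessing, so average linkage, Euclidean distance, and z-score standardization are assumptions here. The country labels A to F and their two indicator values are synthetic toy data, not World Bank figures, and the function name `cluster_countries` is hypothetical.

```python
from itertools import combinations
from math import dist
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each indicator column so no single unit dominates the distance."""
    columns = list(zip(*rows))
    mus = [mean(col) for col in columns]
    sigmas = [pstdev(col) or 1.0 for col in columns]  # guard constant columns
    return [
        tuple((x - mu) / sigma for x, mu, sigma in zip(row, mus, sigmas))
        for row in rows
    ]


def cluster_countries(indicators, n_clusters):
    """Agglomerative clustering (assumed: average linkage, Euclidean distance).

    `indicators` maps a country label to a tuple of indicator values.
    Returns a list of clusters, each a sorted list of country labels.
    """
    names = list(indicators)
    points = standardize([indicators[name] for name in names])
    clusters = [[i] for i in range(len(points))]  # start with singletons

    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters with the smallest mean pairwise distance.
        for a, b in combinations(range(len(clusters)), 2):
            total = sum(
                dist(points[i], points[j])
                for i in clusters[a]
                for j in clusters[b]
            )
            avg = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or avg < best[0]:
                best = (avg, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is still valid

    return [sorted(names[i] for i in cluster) for cluster in clusters]


# Synthetic toy data: (life expectancy in years, basic sanitation coverage in %).
toy_indicators = {
    "A": (81.0, 99.0),
    "B": (80.0, 98.0),
    "C": (62.0, 40.0),
    "D": (60.0, 35.0),
    "E": (72.0, 75.0),
    "F": (73.0, 78.0),
}

groups = cluster_countries(toy_indicators, n_clusters=3)
print(groups)
```

On real data, one would replace `toy_indicators` with the selected World Bank indicators for each country and choose the number of clusters by inspecting the merge distances; a library routine such as SciPy's hierarchical clustering would be the more practical choice at the scale of 131 countries.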