Live Birth Forecasting in Brazilian Health Regions with Tree-based Machine Learning Models

Douglas Vieira Do Nascimento, Rafael Teixeira Sousa, Diogo Fernandes Costa Silva, Daniel Do Prado Pagotto, Clarimar José Coelho, Arlindo Rodrigues Galvão Filho

Published: 01 Jun 2023, Last Modified: 07 Jan 2026 | Crossref | License: CC BY-SA 4.0
Abstract: This paper applies modern tree-based machine learning models to time series forecasting of live births in Brazil. These models are popular choices for time series forecasting because they can model non-linear relationships, so they were applied to live birth forecasting with multiple covariates. The study uses data from the Brazilian Ministry of Health to train and evaluate the forecasting models, following the Ministry's expectations and needs for using forecasts in public policy planning. The data cover all 450 micro-regions in Brazil, with records from 2000 to 2020. The models are trained on all months from 2000 to 2018 and assessed on forecasting the number of births in 2019 and 2020. LightGBM, XGBoost, and CatBoost were evaluated and compared with AutoARIMA and simple linear regression. LightGBM performed slightly better than the other models evaluated, achieving a MAPE of 0.0797, with more consistent performance over the 24 months of the forecasting horizon. The results show that tree-based models are reliable for dealing with multiple covariates and can be a useful tool for public policy planning.
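The evaluation protocol described in the abstract (train on 2000 to 2018, forecast the 24 months of 2019 and 2020, score with MAPE) can be illustrated with a minimal sketch. This is not the authors' code: the monthly counts are made up, the seasonal-naive forecast is only a stand-in for a trained model, and the function names `mape` and `split_by_year` are hypothetical. It also assumes MAPE is reported as a fraction, so that 0.0797 would correspond to roughly 8%.

```python
from datetime import date

def mape(actual, forecast):
    """Mean absolute percentage error as a fraction (0.08 means 8%)."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    # Skip zero actuals, where the percentage error is undefined.
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    if not pairs:
        raise ValueError("no non-zero actual values to evaluate")
    return sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)

def split_by_year(records, last_train_year=2018):
    """Split (month_start, births) records into train and test by calendar year."""
    train = [r for r in records if r[0].year <= last_train_year]
    test = [r for r in records if r[0].year > last_train_year]
    return train, test

# Made-up monthly counts for one hypothetical region, 2000 to 2020,
# with a seasonal pattern and a slow downward trend.
records = [
    (date(y, m, 1), 1000 + 10 * m - 2 * (y - 2000))
    for y in range(2000, 2021)
    for m in range(1, 13)
]
train, test = split_by_year(records)

# Seasonal-naive stand-in for a trained model (an assumption, not the paper's
# method): reuse the 2018 value of the same calendar month for every test month.
last_year = {d.month: v for d, v in train if d.year == 2018}
forecast = [last_year[d.month] for d, _ in test]
actual = [v for _, v in test]
score = mape(actual, forecast)
print(f"train months: {len(train)}, test months: {len(test)}, MAPE: {score:.4f}")
```

Splitting by calendar year rather than at random keeps every test month strictly later than every training month, which is what a forecasting evaluation of this kind requires.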