A Study of the Effectiveness of Correction Factors for Log Transforms in Ensemble Models

2022 (modified: 19 Jan 2023), ADMA (2) 2022, Readers: Everyone
Abstract: We consider the problem of making predictions about long-tailed interval variables. Such variables are commonplace in revenue prediction, where a small part of the population has very large positive or negative values. Log transforms are often used in such problems, and when the target is modeled in log space, a correction factor is required when converting predictions back to the original space. In this work, we study the effectiveness of two approaches to applying correction factors: at the individual model level and at the whole model level. In particular, we compare ensembles of simpler models (decision trees) with individual correction factors against XGBoost, an averaging model using an overall correction factor. We show that the ensembles of simple models outperform XGBoost when the correction factors are applied separately to each component model.
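
The two correction schemes described in the abstract can be illustrated with a minimal sketch. The abstract does not say which correction factor the authors use, so this example assumes Duan's smearing factor (the mean of the exponentiated log-space residuals). The component "models" are hypothetical constant predictors fitted on bootstrap samples, standing in for decision trees, and the data are synthetic, strictly positive log-normal values. All function names are illustrative and not taken from the paper.

```python
import math
import random


def mean(values):
    return sum(values) / len(values)


def smearing_factor(log_y, log_pred):
    # Assumed correction: Duan's smearing factor, the mean of exp(residual) in log space.
    return mean([math.exp(t - p) for t, p in zip(log_y, log_pred)])


def fit_constant(log_y):
    # Hypothetical stand-in for a component model: predicts the mean log target.
    return mean(log_y)


def naive_prediction(bootstrap_samples):
    # Back-transform the averaged log prediction with no correction factor.
    return math.exp(mean([fit_constant(s) for s in bootstrap_samples]))


def per_model_prediction(bootstrap_samples):
    # Correct each component with its own factor, then average in the original space.
    preds = []
    for sample in bootstrap_samples:
        mu = fit_constant(sample)
        factor = smearing_factor(sample, [mu] * len(sample))
        preds.append(math.exp(mu) * factor)
    return mean(preds)


def overall_prediction(bootstrap_samples, log_y):
    # Average the components in log space, then apply one factor to the whole ensemble.
    mu = mean([fit_constant(s) for s in bootstrap_samples])
    factor = smearing_factor(log_y, [mu] * len(log_y))
    return math.exp(mu) * factor


def make_data(n=2000, n_models=10, seed=0):
    # Synthetic log-normal targets plus one bootstrap sample per component model.
    rng = random.Random(seed)
    log_y = [rng.gauss(3.0, 1.0) for _ in range(n)]
    samples = [rng.choices(log_y, k=n) for _ in range(n_models)]
    return log_y, samples


log_y, samples = make_data()
print("mean of y in original space :", mean([math.exp(v) for v in log_y]))
print("naive back-transform        :", naive_prediction(samples))
print("per-model correction factors:", per_model_prediction(samples))
print("single overall factor       :", overall_prediction(samples, log_y))
```

With constant components the two corrected predictions land close to each other and both sit well above the naive back-transform, which underestimates the mean of a right-skewed target. The sketch only shows where each factor is computed and applied; it does not reproduce the paper's comparison, which uses decision-tree ensembles and XGBoost.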