Reproducibility Report for Reproducibility Challenge 2021

Anonymous

05 Feb 2022 (modified: 05 May 2023) · ML Reproducibility Challenge 2021 Fall Blind Submission · Readers: Everyone
Keywords: hierarchical time series, probabilistic forecasting, deep learning
TL;DR: A reproducibility study performed as part of Reproducibility Challenge 2021
Abstract: Reproducibility Summary

Scope of Reproducibility: In this work we attempt to reproduce the results supporting the newly proposed HierE2E model of Rangapuram et al. [2021], a hierarchical model that produces coherent probabilistic forecasts. This deep learning approach is compared against 11 benchmarks on 5 datasets commonly used in the time series forecasting community. Model performance is measured with the Continuous Ranked Probability Score (CRPS).

Methodology: The authors provide a well-organized repository, gluonts-hierarchical-ICML-2021, whose code we extended for our experiments. We first made minor modifications to the evaluation code for the classical machine learning models, and then added the scripts needed to perform the hyperparameter grid search for the deep learning methods.

Results: Our results align with 2 of the 4 central claims made in the original work. We reach the same conclusion as the authors that multivariate models such as DeepVAR, proposed in Salinas et al. [2019], and DeepVAR+ (a DeepVAR variant modified for analysis purposes in Rangapuram et al. [2021]) outperform the state of the art in hierarchical forecasting. We also confirm their claim that HierE2E improves consistently along the levels of a hierarchy. However, our results do not conclusively show that HierE2E is the best model among those evaluated; this contrast with their findings holds for both the overall and the level-wise CRPS scores.

What was easy: The data and code used in the experiments were well organized in the repository accompanying the paper, so we could readily build upon the authors' work.

What was difficult: Due to resource limitations, we were unable to test certain hyperparameter configurations. However, the configurations were omitted in such a manner that this did not hinder the investigation of the claims made in the paper. We also encountered minor difficulties when installing certain libraries for the test environment proposed by the authors, most likely because our operating system distributions did not match theirs, which were not specified in the repository accompanying the paper.

Communication with original authors: Before running the experiments, we contacted the corresponding author of the paper to obtain the runtimes of the models, which we needed in order to allocate our compute resources. We did not receive a response.
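The comparisons above are scored with CRPS. As a rough illustration only (the paper itself relies on the GluonTS evaluation code, not this snippet), the standard sample-based CRPS estimator for a single observation can be sketched as:

```python
import numpy as np

def crps_from_samples(samples, y):
    """Sample-based CRPS estimate for one observation y.

    Uses the identity CRPS(F, y) = E|X - y| - 0.5 * E|X - X'|,
    where X, X' are independent draws from the forecast distribution F.
    """
    samples = np.asarray(samples, dtype=float)
    term1 = np.mean(np.abs(samples - y))                               # E|X - y|
    term2 = 0.5 * np.mean(np.abs(samples[:, None] - samples[None, :])) # 0.5 * E|X - X'|
    return term1 - term2

# A point-mass forecast exactly at the observation scores 0 (perfect).
assert crps_from_samples([2.0, 2.0, 2.0], 2.0) == 0.0
```

Lower CRPS is better; for a deterministic forecast it reduces to the absolute error, which is why it is a natural metric for comparing probabilistic and point forecasters on equal footing.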
Paper Url: https://proceedings.mlr.press/v139/rangapuram21a.html
Paper Venue: ICML 2021
Supplementary Material: zip
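The abstract describes HierE2E as producing coherent probabilistic forecasts, i.e. forecast samples whose aggregate series equal the sums of their children. A minimal sketch of this idea, assuming a toy three-node hierarchy (total = left + right) and the usual orthogonal projection onto the coherent subspace (illustrative only, not the authors' implementation):

```python
import numpy as np

# Constraint matrix: a vector y = [y_total, y_left, y_right] is
# coherent exactly when A @ y = 0, i.e. y_total = y_left + y_right.
A = np.array([[1.0, -1.0, -1.0]])

def project_to_coherent(y):
    """Orthogonally project a forecast sample onto {y : A @ y = 0}
    via y - A^T (A A^T)^{-1} A y."""
    y = np.asarray(y, dtype=float)
    correction = A.T @ np.linalg.solve(A @ A.T, A @ y)
    return y - correction

incoherent = np.array([10.0, 4.0, 5.0])  # total != left + right
coherent = project_to_coherent(incoherent)
assert np.allclose(A @ coherent, 0.0)    # aggregation constraint now holds
```

Applying such a projection to every forecast sample guarantees coherence by construction, which is what lets a model be trained end-to-end on the reconciled samples rather than reconciling point forecasts after the fact.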
