# TSBench

This repository contains the code for...

- ... benchmarking 13 time-series forecasting methods included in [GluonTS](https://ts.gluon.ai/)
  on 45 public datasets with respect to multiple performance measures.
- ... leveraging the collected data to train surrogate models that learn good defaults for models
  and their hyperparameters on unknown datasets.

## Code Structure

From a high level, the code is structured as follows:

- `src` contains all source code. It is further structured in the following way:
    - `src/cli` contains all scripts that prepare data, benchmark forecasting methods, or evaluate
      surrogate models.
    - `src/tsbench` can be used as a library and provides the backbone for all scripts.
    - `src/schedule.py` can be used to schedule hyperparameter searches, i.e., it calls some of
      the scripts in `src/cli` with different parameters. This script is available as `schedule`
      after installation of this project.
- `build` contains the Dockerfiles for preparing Docker containers that run experiments on
  [AWS Sagemaker](https://aws.amazon.com/sagemaker/).
- `configs` contains configuration files that define the hyperparameter searches for experiments.
  These configuration files are used by the `schedule` script.
- `notebooks` contains Jupyter notebooks that generate result tables/plots after experiments have
  been run successfully.

## EC2 Setup

The following steps configure an AWS EC2 instance running Ubuntu 20.04 to run `schedule` and all
scripts found in `src/cli`. They assume that the project is located at `~/ts-bench`.

### Install R

```bash
sudo apt-get update
sudo apt-get install -y make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev \
    libsqlite3-dev wget curl llvm libncurses5-dev xz-utils tk-dev libxml2-dev libxmlsec1-dev \
    libffi-dev liblzma-dev libcurl4-openssl-dev r-base
sudo R -e 'install.packages(c("forecast", "nnfor"), repos="https://cloud.r-project.org")'
```

### Install Python

```bash
git clone https://github.com/pyenv/pyenv.git ~/.pyenv
cd ~/.pyenv && src/configure && make -C src
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.profile
echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.profile
echo 'eval "$(pyenv init --path)"' >> ~/.profile
source ~/.profile
pyenv install 3.8.9
echo "3.8.9" > ~/.python-version
```

### Install Poetry

```bash
curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python -
source $HOME/.poetry/env
```

### Install Docker

```bash
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl gnupg lsb-release
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | \
    sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo \
  "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io
sudo usermod -aG docker $USER
```

### Optional: Attach Existing Data Volumes

```bash
mkdir ~/data
sudo mount /dev/nvme1n1 ~/data
# OPTIONAL: sudo chown ubuntu:ubuntu ~/data

mkdir ~/mongodb
sudo mount /dev/nvme2n1 ~/mongodb
# OPTIONAL: sudo chown mongodb:mongodb ~/mongodb

mkdir ~/results
sudo mount /dev/nvme3n1 ~/results
# OPTIONAL: sudo chown ubuntu:ubuntu ~/results
```

### Install and Configure MongoDB

```bash
wget -qO - https://www.mongodb.org/static/pgp/server-5.0.asc | sudo apt-key add -
echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu focal/mongodb-org/5.0 multiverse" | \
    sudo tee /etc/apt/sources.list.d/mongodb-org-5.0.list
sudo apt-get update
sudo apt-get install -y mongodb-org
sudo sed -i -e 's/dbPath:.*/dbPath: \/home\/ubuntu\/mongodb/g' /etc/mongod.conf
sudo systemctl restart mongod
```

### Install Project Dependencies

```bash
cd ~/ts-bench
poetry config virtualenvs.in-project true
poetry install
```

### Install Helpers

```bash
sudo apt-get install -y bmon cloc msttcorefonts ttf-mscorefonts-installer
```

## CLI Scripts

In the following, we briefly describe the different CLI scripts found in `src/cli` as well as the
`schedule` script.

### Local Scripts

`src/cli/local` contains scripts that are meant to be run locally, typically on an AWS EC2 instance.

- `download_data.py` downloads and prepares all 45 benchmark datasets. The datasets from Kaggle
  must be downloaded manually, however: the exceptions raised by the script state where to place
  them so that they can be processed.
- `upload_data.py` uploads the datasets prepared via `download_data.py` to an S3 bucket to be used
  by AWS Sagemaker. Data is uploaded to `data/<dataset_name>`.
- `compute_stats.py` computes simple dataset statistics and caches them on disk to speed up other
  experiments.
- `compute_catch22.py` computes the [catch22](https://github.com/chlubba/catch22) features of all
  time series of all datasets and caches these features on disk to speed up other experiments. Note
  that running this script on all datasets at once requires ~350 GiB of RAM due to a memory leak in
  catch22. If you don't have that much RAM available, run the script sequentially on individual
  datasets by passing the `--dataset` option. This still requires ~100 GiB of RAM for the largest
  dataset (Corporación Favorita).
- `pull_results.py` pulls the results from benchmarks run on AWS Sagemaker. It stores all results
  in `~/data` and requires ~2.5 TiB of disk space when evaluating all 13 models on all 45 benchmark
  datasets. Running this script usually takes several hours.
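
The per-dataset workaround for `compute_catch22.py` can be scripted. The following is a minimal
sketch and not part of this repository: the script path, the direct invocation through the Python
interpreter, and the dataset names are assumptions for illustration. Only the `--dataset` option is
taken from the description above. Each dataset is processed in its own process, so the memory
leaked by catch22 is released when that process exits.

```python
import subprocess
import sys
from typing import List, Sequence

# Assumed location of the script relative to the project root.
SCRIPT = "src/cli/local/compute_catch22.py"


def build_commands(datasets: Sequence[str], script: str = SCRIPT) -> List[List[str]]:
    """Build one command line per dataset, each passing the `--dataset` option."""
    return [[sys.executable, script, "--dataset", name] for name in datasets]


def run_sequentially(commands: Sequence[Sequence[str]]) -> None:
    """Run the commands one after another and stop at the first failure."""
    for command in commands:
        # Every call starts a fresh process; its memory is freed when it exits.
        subprocess.run(list(command), check=True)


# Hypothetical dataset names; replace them with names the script accepts.
commands = build_commands(["dataset_a", "dataset_b"])
for command in commands:
    print(" ".join(command))
```

Passing `commands` to `run_sequentially` from the project root would then execute the script once
per dataset.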

### Sagemaker Scripts

`src/cli/sagemaker` contains scripts that ought to be run on AWS Sagemaker. Nonetheless, they can
be tested locally.

- `benchmark.py` runs training and evaluation (with respect to multiple performance metrics) for
  a single model on a single dataset.

### Sacred Scripts

`src/cli/sacred` contains scripts for training and evaluating surrogate models. They are meant to
be tracked with [Sacred](https://github.com/IDSIA/sacred). All of these scripts leverage the data
collected from the benchmark and run evaluation via leave-one-out cross-validation (LOOCV): each
dataset used in the benchmark serves once as the test dataset, while the remaining datasets are
used for training.

- `evaluate_surrogate.py` evaluates the performance of a surrogate model with respect to different
  ranking and regression metrics.
- `evaluate_recommender.py` evaluates a recommender (that potentially uses a surrogate model) and
  logs the model choices for all test datasets.
- `evaluate_ensemble.py` evaluates the ensemble produced by a recommender and logs the ensemble's
  models along with its performance for all test datasets.
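
The leave-one-dataset-out scheme used by these scripts can be sketched as follows. This is an
illustration of the splitting logic only, not the code in `src/tsbench`: the function names, the
dataset names, and the placeholder scoring function are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def leave_one_dataset_out(datasets: Sequence[str]) -> List[Tuple[List[str], str]]:
    """Return (training datasets, test dataset) pairs, one pair per dataset."""
    splits = []
    for i, test in enumerate(datasets):
        train = [name for j, name in enumerate(datasets) if j != i]
        splits.append((train, test))
    return splits


def evaluate_loocv(
    datasets: Sequence[str],
    fit_and_score: Callable[[List[str], str], float],
) -> Dict[str, float]:
    """Fit on each training split and record the score on the held-out dataset."""
    return {
        test: fit_and_score(train, test)
        for train, test in leave_one_dataset_out(datasets)
    }


# Hypothetical dataset names and a placeholder in place of a real surrogate.
names = ["dataset_a", "dataset_b", "dataset_c"]
scores = evaluate_loocv(names, lambda train, test: float(len(train)))
print(scores)
```

In a real evaluation, `fit_and_score` would train a surrogate on the benchmark results of the
training datasets and compute a ranking or regression metric on the held-out dataset.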

### Schedule Script

`src/schedule.py` may be used to automatically run the Sagemaker and Sacred scripts using various
parameters. The configuration files for the parameters to use can be found in the `configs` folder.
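
To illustrate what running a script with various parameters can look like, the sketch below
expands a search space into individual parameter combinations (a plain grid search). The
dictionary layout and the parameter names are assumptions for illustration. They reflect neither
the format of the files in `configs` nor the actual implementation of `schedule`.

```python
import itertools
from typing import Any, Dict, Iterator, List


def expand_grid(search_space: Dict[str, List[Any]]) -> Iterator[Dict[str, Any]]:
    """Yield one parameter dictionary per point of the Cartesian product."""
    keys = list(search_space)
    for values in itertools.product(*(search_space[key] for key in keys)):
        yield dict(zip(keys, values))


# Hypothetical search space; names and values are placeholders.
space = {"model": ["model_a", "model_b"], "learning_rate": [1e-3, 1e-2]}
for params in expand_grid(space):
    print(params)
```

Each yielded dictionary corresponds to one invocation of a script with one fixed set of parameters.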

## Running Experiments

The following commands schedule the various kinds of experiments.

### Benchmark Evaluations

```bash
schedule \
    configs/benchmark \
    benchmark \
    --aws \
    --name ts-bench
```

### Evaluate Surrogates

```bash
schedule \
    configs/evaluation/surrogates.yaml \
    evaluate-surrogates
```

### Evaluate Recommenders

```bash
schedule \
    configs/evaluation/recommenders.yaml \
    evaluate-recommenders
schedule \
    configs/evaluation/recommenders_prob.yaml \
    evaluate-recommenders
```

### Evaluate Ensembles

```bash
schedule \
    configs/evaluation/ensembles.yaml \
    evaluate-ensembles
```
