# Feature-aligned N-BEATS with Sinkhorn divergence

## Data

Each dataset should be stored as `data/<SUPERDOMAIN>/<DOMAIN>.csv`, a CSV file with three columns:

- `time` denotes the time index
- `series` denotes the series index
- `value` denotes the value of the time series at the time index

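As a minimal sketch of the layout described above, the snippet below reads a `time,series,value` file and groups the values by series. It assumes a header row with exactly those column names, integer `time` and `series` indices, and float values; the helper name `load_domain` and the sample rows are hypothetical and not part of this repository.

```python
import csv
import io
from collections import defaultdict


def load_domain(file_obj):
    """Group rows of a `time,series,value` CSV into one time-ordered list per series."""
    points = defaultdict(list)
    for row in csv.DictReader(file_obj):
        # Assumption: `time` and `series` are integer indices, `value` is a float.
        points[int(row["series"])].append((int(row["time"]), float(row["value"])))
    # Sort each series by its time index and keep only the values.
    return {sid: [v for _, v in sorted(obs)] for sid, obs in points.items()}


# Hypothetical contents of a file such as data/<SUPERDOMAIN>/<DOMAIN>.csv
sample = io.StringIO(
    "time,series,value\n"
    "0,0,1.5\n"
    "1,0,1.7\n"
    "0,1,20.0\n"
    "1,1,19.5\n"
)
series = load_domain(sample)
print(series)  # {0: [1.5, 1.7], 1: [20.0, 19.5]}
```

For a file on disk, the same helper could be called as `load_domain(open(path, newline=""))`.
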
### Source

The datasets can be downloaded [here](https://drive.google.com/file/d/1TS8GJNG3daULWnfka5Iz0XSviIHiMnXT). They are drawn from the following sources:

- [FRED](https://fred.stlouisfed.org)
  - [Commodities](https://fred.stlouisfed.org/categories/32217) category
  - [National Income & Product Accounts](https://fred.stlouisfed.org/categories/18) category
  - [Interest Rates](https://fred.stlouisfed.org/categories/22) category
  - [Exchange Rates](https://fred.stlouisfed.org/categories/15) category
- [NCEI](https://ncei.noaa.gov): data from the 2020s, taken from [Global Surface Summary of the Day - GSOD](https://ncei.noaa.gov/metadata/geoportal/rest/metadata/item/gov.noaa.ncdc:C00516/html), [Global Summary of the Month (GSOM), Version 1.0.3](https://ncei.noaa.gov/metadata/geoportal/rest/metadata/item/gov.noaa.ncdc:C00946/html), and [Global Summary of the Year (GSOY), Version 1](https://ncei.noaa.gov/metadata/geoportal/rest/metadata/item/gov.noaa.ncdc:C00947/html)
  - `"TEMP", "STP", "WDSP", "PRCP"` columns (GSOD)
  - `"TAVG", "AWND", "PRCP"` columns (GSOM and GSOY)

## Usage

```shell
python main.py --source_domains $SOURCE_DOMAIN1 $SOURCE_DOMAIN2 ... \
               --target_domain $TARGET_DOMAIN \
               --forecast_horizon $FORECAST_HORIZON \
               --lookback_multiple $LOOKBACK_MULTIPLE \
               --model $MODEL \
               --loss $LOSS \
               --regularizer $REGULARIZER \
               --temperature $TEMPERATURE \
               --reduce_type $REDUCE_TYPE \
               --scaler $SCALER \
               --metric $METRIC \
               --learning_rate $LEARNING_RATE \
               --num_lr_cycles $NUM_LR_CYCLES \
               --batch_size $BATCH_SIZE \
               --num_iters $NUM_ITERS \
               --seed $SEED \
               --dtype $DTYPE \
               --data_size $DATA_SIZE
```
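To make the flags above concrete, here is a hypothetical `argparse` declaration that mirrors them, with the defaults documented in this README. It is a sketch, not the repository's actual parser: the argument types, the `required=True` settings, the `int_or_none` helper, and the placeholder domain names are all assumptions.

```python
import argparse


def int_or_none(text):
    """Parse an integer, treating the literal string "None" as None."""
    return None if text == "None" else int(text)


def build_parser():
    # Hypothetical parser: flag names and defaults follow this README only;
    # the types and required flags are assumptions.
    p = argparse.ArgumentParser()
    p.add_argument("--source_domains", nargs="+", required=True)
    p.add_argument("--target_domain", required=True)
    p.add_argument("--forecast_horizon", type=int, default=10)
    p.add_argument("--lookback_multiple", type=int, default=5)
    p.add_argument("--model", default="NHiTS")
    p.add_argument("--loss", default="SMAPE")
    p.add_argument("--regularizer", default="Sinkhorn")
    p.add_argument("--temperature", type=float, default=1.0)
    p.add_argument("--reduce_type", default="max")
    p.add_argument("--scaler", default="softmax")
    p.add_argument("--metric", default="SMAPE")
    p.add_argument("--learning_rate", type=float, default=2e-5)
    p.add_argument("--num_lr_cycles", type=int, default=50)
    p.add_argument("--batch_size", type=int, default=2**12)
    p.add_argument("--num_iters", type=int, default=1000)
    p.add_argument("--seed", type=int, default=0)
    p.add_argument("--dtype", default="float32")
    p.add_argument("--data_size", type=int_or_none, default=75000)
    return p


# Placeholder domain names; real values depend on the files under data/.
args = build_parser().parse_args(
    ["--source_domains", "DOMAIN_A", "DOMAIN_B",
     "--target_domain", "DOMAIN_T",
     "--data_size", "None"]
)
print(args.source_domains, args.forecast_horizon, args.data_size)
# ['DOMAIN_A', 'DOMAIN_B'] 10 None
```

Several source domains are passed as space-separated values after a single `--source_domains` flag, which is what `nargs="+"` models here.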

The arguments are described below.

| Argument            | Description                                                                                                                                                                                       | Default      |
| ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------ |
| `source_domains`    | Source domains $\{\mathcal{D}^k\}_k$                                                                                                                                                              |              |
| `target_domain`     | Target domain $\mathcal{D}^T$                                                                                                                                                                     |              |
| `forecast_horizon`  | Forecast horizon $\alpha$                                                                                                                                                                         | `10`         |
| `lookback_multiple` | Lookback multiple $\beta/\alpha$                                                                                                                                                                  | `5`          |
| `model`             | Model architecture $\mathfrak{F}$                                                                                                                                                                 | `"NHiTS"`    |
| `loss`              | Loss function $\mathcal{L}$                                                                                                                                                                       | `"SMAPE"`    |
| `regularizer`       | Regularizer measure $\mathcal{L}_\mathrm{align}$ <br> NOTE: Set this to `"None"` for the vanilla model                                                                                            | `"Sinkhorn"` |
| `temperature`       | Temperature $\lambda$                                                                                                                                                                             | `1.0`        |
| `reduce_type`       | Type of reduction for stack-wise feature alignment                                                                                                                                                | `"max"`      |
| `scaler`            | Scaling function $\sigma$                                                                                                                                                                         | `"softmax"`  |
| `metric`            | Metric for validation and test                                                                                                                                                                    | `"SMAPE"`    |
| `learning_rate`     | Learning rate $\eta$                                                                                                                                                                              | `2e-5`       |
| `num_lr_cycles`     | Number of learning rate cycles<br>NOTE: `torch.optim.lr_scheduler.CyclicLR(mode="triangular2")` is used ([ref](https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.CyclicLR.html)) | `50`         |
| `batch_size`        | Batch size $B$                                                                                                                                                                                    | `2**12`      |
| `num_iters`         | Number of iterations                                                                                                                                                                              | `1000`       |
| `seed`              | Random seed                                                                                                                                                                                       | `0`          |
| `dtype`             | Data type used for `torch` and `numpy`                                                                                                                                                            | `"float32"`  |
| `data_size`         | Fixed data size for each domain <br> NOTE: Set this to `"None"` to use all data                                                                                                                   | `75000`      |

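The following sketch illustrates how `forecast_horizon` ($\alpha$) and `lookback_multiple` ($\beta/\alpha$) combine into a lookback length $\beta$, and one common definition of SMAPE. With the defaults, $\beta = 5 \times 10 = 50$. The helper names are hypothetical, and the repository's own windowing and SMAPE scaling may differ from what is assumed here.

```python
def lookback_length(forecast_horizon=10, lookback_multiple=5):
    """Lookback length beta implied by alpha and the multiple beta/alpha."""
    return forecast_horizon * lookback_multiple


def split_window(values, start, forecast_horizon=10, lookback_multiple=5):
    """Return (lookback, target) slices of one series, starting at index `start`."""
    beta = lookback_length(forecast_horizon, lookback_multiple)
    lookback = values[start:start + beta]
    target = values[start + beta:start + beta + forecast_horizon]
    return lookback, target


def smape(y_true, y_pred):
    """SMAPE in percent: (200 / H) * sum |y - yhat| / (|y| + |yhat|)."""
    total = 0.0
    for y, yhat in zip(y_true, y_pred):
        denom = abs(y) + abs(yhat)
        # Convention assumed here: a term with zero denominator contributes 0.
        total += abs(y - yhat) / denom if denom else 0.0
    return 200.0 * total / len(y_true)


values = [float(t) for t in range(1, 101)]  # toy series 1.0, ..., 100.0
x, y = split_window(values, start=0)
print(len(x), len(y))  # 50 10
print(smape(y, y))     # 0.0
```

A perfect forecast gives a SMAPE of 0, and under this definition the value is bounded above by 200.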