# SCALE

SCALE is a two-stage project for uncertainty-aware spatiotemporal forecasting:
- Stage 1: Train a base forecasting model (RNN / STGNN / Transformer / DCRNN / GraphWaveNet / AGCRN / VAR, etc.) and save residuals plus indices.
- Stage 2: Train the SCALE quantile model on the Stage-1 outputs. It uses a spectral graph wavelet transform (SGWT) to decouple low- and high-frequency components, then combines a low-frequency non-exchangeable backbone, high-frequency exchangeable statistics, and gating to output multi-quantile prediction intervals.
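
To make the low/high-frequency decoupling idea concrete, here is a minimal sketch in plain Python. It is not the SGWT code from `conformal_model/scale/`: it swaps in a simple polynomial low-pass graph filter, and the names `normalized_laplacian`, `low_high_split`, and `order` are hypothetical.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def normalized_laplacian(adj: Sequence[Sequence[float]]) -> Matrix:
    """Return L = I - D^{-1/2} A D^{-1/2} for a symmetric adjacency matrix."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    inv_sqrt = [d ** -0.5 if d > 0 else 0.0 for d in deg]
    return [
        [(1.0 if i == j else 0.0) - inv_sqrt[i] * adj[i][j] * inv_sqrt[j]
         for j in range(n)]
        for i in range(n)
    ]


def matvec(m: Matrix, x: Sequence[float]) -> List[float]:
    """Dense matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]


def low_high_split(
    adj: Sequence[Sequence[float]], x: Sequence[float], order: int = 2
) -> Tuple[List[float], List[float]]:
    """Split a node signal into a smooth (low) part and the remainder (high).

    The low-pass step applies (I - L/2) `order` times, so its spectral
    response is (1 - lambda/2)^order on Laplacian eigenvalues in [0, 2].
    This is an illustrative stand-in, not an SGWT kernel.
    """
    lap = normalized_laplacian(adj)
    low = [float(v) for v in x]
    for _ in range(order):
        lx = matvec(lap, low)
        low = [v - 0.5 * l for v, l in zip(low, lx)]
    high = [v - l for v, l in zip(x, low)]
    return low, high


# Path graph with 4 nodes and a rapidly alternating signal.
adj = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
low, high = low_high_split(adj, [1.0, -1.0, 1.0, -1.0], order=2)
```

By construction `low` and `high` sum back to the input signal, so the split loses no information; a signal proportional to the square root of the node degrees passes through the low branch unchanged.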

The project uses PyTorch Lightning and Torch Spatiotemporal (tsl) for training and data pipelines. All outputs are written to `logs/`.

## Repository structure
- `basicts/`: shared components (predictors, metrics, residual utilities, logging).
- `conformal_model/scale/`: SCALE model implementation and configs (SGWT, decoupled components, backbones, losses).
- `foundation_model/`: base-model training configs (Hydra-style YAML).
- `experiments/`: experiment entry points (base training, SCALE training, config runner).
- `datasets/`: datasets directory (includes MetrLA, PeMS04/07/08 samples).
- `logs/`: training logs and artifacts.
- `config.yaml`: global path configuration (data, logs, etc.).
- `install.txt`: dependency install hints.

## Environment and dependencies
Core dependencies (inferred from imports in the code):
- PyTorch + PyTorch Lightning
- torch-geometric and its extensions (`pyg_lib`, `torch_scatter`, `torch_sparse`, `torch_cluster`, `torch_spline_conv`)
- torch-spatiotemporal (`tsl`)
- omegaconf, hydra-core
- tensorboard / tensorboardX

Install the packages listed in `install.txt`. The wheel index URL below targets PyTorch 2.8.0 with CUDA 12.9; change it to match your installed PyTorch and CUDA versions:
```bash
pip install torch_geometric
pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-2.8.0+cu129.html
pip install torch-spatiotemporal
pip install omegaconf
pip install hydra-core
pip install tensorboard
pip install tensorboardX
```

## Quick start
### 1) Train a base model and generate residuals
Default entry point:
```bash
python experiments/run_base_model.py
```
The script reads its configuration from `foundation_model/training/` (by default `default.yaml`) and writes the residuals, sample indices, and resolved config to the run directory:
- `logs/base/<dataset>/<model>/<date>/<time>/residuals.h5`
- `logs/base/<dataset>/<model>/<date>/<time>/indices.npz`
- `logs/base/<dataset>/<model>/<date>/<time>/config.yaml`
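
Before moving on to Stage 2, it can help to confirm that a run directory actually contains the three files listed above. The following is a small sketch using only the standard library; the helper name `missing_stage1_files` is hypothetical and not part of the repository.

```python
from pathlib import Path
from typing import List

# File names taken from the list above.
EXPECTED_FILES = ("residuals.h5", "indices.npz", "config.yaml")


def missing_stage1_files(run_dir: str) -> List[str]:
    """Return the expected Stage-1 artifacts that are absent from run_dir."""
    root = Path(run_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

Calling `missing_stage1_files("logs/base/<dataset>/<model>/<date>/<time>")` with the placeholders filled in returns an empty list when the run directory is complete.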

To change model/dataset, edit YAML files under `foundation_model/training/` (e.g., `dataset/*.yaml`, `model/*.yaml`).

### 2) Run the SCALE quantile model
SCALE configs are Python files with a `CONFIG` dict. Use the config runner:
1) Open `conformal_model/scale/config/<dataset>.py` and set `src_dir` to the Stage-1 output directory.
2) Run:
```bash
python experiments/run_config.py --config conformal_model/scale/config/la.py
```
Outputs go to:
- `logs/scale/<dataset>/scale/<date>/<time>/`

This folder includes `metrics.json`, `runner.log`, Lightning checkpoints, and TensorBoard logs.

## Configuration notes
### Base model (Stage-1)
Entry: `foundation_model/training/default.yaml`
- `dataset`: dataset name and split strategy (`splitting`)
- `model`: model configuration (RNN/STGNN/Transformer/DCRNN/GWNet/AGCRN, etc.)
- `optimizer` / `lr` / `weight_decay`: optimizer setup
- `window` / `horizon` / `stride` / `delay`: temporal windowing
- `apply_scaler` / `scale_axis`: normalization settings
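
The four windowing parameters interact as sketched below. This assumes tsl-style semantics (`window` input steps, then a gap of `delay` steps, then `horizon` target steps, with consecutive samples `stride` steps apart); the function `sample_indices` is a hypothetical illustration, not code from this repository or from tsl.

```python
from typing import List, Tuple


def sample_indices(
    n_steps: int, window: int, horizon: int, stride: int = 1, delay: int = 0
) -> List[Tuple[range, range]]:
    """Return (input_steps, target_steps) index ranges for each sample."""
    span = window + delay + horizon  # time steps covered by one sample
    samples = []
    for start in range(0, n_steps - span + 1, stride):
        inputs = range(start, start + window)
        target_start = start + window + delay
        targets = range(target_start, target_start + horizon)
        samples.append((inputs, targets))
    return samples


# 10 time steps, 3-step inputs, 2-step targets, one-step gap, stride 2.
for inputs, targets in sample_indices(10, window=3, horizon=2, stride=2, delay=1):
    print(list(inputs), "->", list(targets))
```

With these toy values the first sample maps steps `[0, 1, 2]` to `[4, 5]`, and only three samples fit into the series.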

### SCALE (Stage-2)
Entry: `conformal_model/scale/config/*.py`
Key fields:
- `src_dir`: Stage-1 output directory (required)
- `alphas`: target miscoverage, e.g. `[0.05, 0.1, 0.2]`
- `window` / `horizon`: must match Stage-1 (strict check)
- `val_len`: validation split ratio for calibration data
- `apply_scaler` / `scale_axis`: normalization settings
- `dataset.connectivity`: graph construction
- `model.hparams.model`: SGWT and backbone params (`n_scales`, `kernel_type`, `n_high_scales`, `enable_gating`, etc.)
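
As a rough picture of how these fields fit together, here is a hypothetical `CONFIG` sketch. Only the field names come from the list above; the nesting is inferred from the dotted names, every value is a placeholder, and the helper `check_matches_stage1` is an assumption about what a strict `window` / `horizon` check could look like, not the repository's actual check.

```python
# Hypothetical sketch of a SCALE config; all values are placeholders.
CONFIG = {
    "src_dir": "logs/base/<dataset>/<model>/<date>/<time>",  # Stage-1 output
    "alphas": [0.05, 0.1, 0.2],  # target miscoverage levels
    "window": 12,                # placeholder; must equal the Stage-1 value
    "horizon": 12,               # placeholder; must equal the Stage-1 value
    "val_len": 0.1,              # placeholder validation ratio
    "apply_scaler": True,        # placeholder
    "scale_axis": "graph",       # placeholder
    "dataset": {"connectivity": {}},  # graph construction options go here
    "model": {
        "hparams": {
            "model": {
                "n_scales": 4,                    # placeholder
                "kernel_type": "<kernel name>",   # placeholder
                "n_high_scales": 2,               # placeholder
                "enable_gating": True,            # placeholder
            }
        }
    },
}


def check_matches_stage1(stage2: dict, stage1: dict,
                         keys=("window", "horizon")) -> None:
    """Raise ValueError if Stage-2 temporal settings differ from Stage-1."""
    mismatched = {
        k: (stage1.get(k), stage2.get(k))
        for k in keys
        if stage1.get(k) != stage2.get(k)
    }
    if mismatched:
        raise ValueError(f"Stage-1/Stage-2 mismatch (stage1, stage2): {mismatched}")


# Passes silently because both stages use the same placeholder values.
check_matches_stage1(CONFIG, {"window": 12, "horizon": 12})
```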

## Datasets
Supported datasets (see `experiments/run_base_model.py` and `experiments/run_scale.py`):
- `la` (MetrLA)
- `pems03`, `pems04`, `pems07`, `pems08`
- `pems_bay`
- `large_st`

Data is expected under `datasets/`. If you use a `tsl` dataset that is not present locally, the first run may trigger a download.

## Outputs and evaluation
SCALE computes and logs multi-quantile metrics:
- Pinball Loss
- Coverage / Delta Coverage
- Prediction Interval Width (PIW)
- Winkler Score

Results are saved in `metrics.json`, with per-alpha summaries under `metrics.per_alpha`.
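
For readers unfamiliar with these metrics, the sketch below computes them with their textbook definitions in plain Python. It is not the repository's metric code: the function names are hypothetical, and "Delta Coverage" is assumed here to mean empirical coverage minus the nominal level `1 - alpha`.

```python
from typing import Dict, Sequence


def pinball_loss(y: float, q: float, tau: float) -> float:
    """Pinball (quantile) loss of prediction q for quantile level tau."""
    diff = y - q
    return max(tau * diff, (tau - 1.0) * diff)


def interval_metrics(
    y: Sequence[float],
    lower: Sequence[float],
    upper: Sequence[float],
    alpha: float,
) -> Dict[str, float]:
    """Coverage, delta coverage, mean width, and Winkler score of intervals."""
    n = len(y)
    covered = 0
    width_sum = 0.0
    winkler_sum = 0.0
    for yi, lo, hi in zip(y, lower, upper):
        width = hi - lo
        width_sum += width
        if lo <= yi <= hi:
            covered += 1
            penalty = 0.0
        elif yi < lo:
            penalty = (2.0 / alpha) * (lo - yi)  # miss below the interval
        else:
            penalty = (2.0 / alpha) * (yi - hi)  # miss above the interval
        winkler_sum += width + penalty
    coverage = covered / n
    return {
        "coverage": coverage,
        "delta_coverage": coverage - (1.0 - alpha),  # assumed definition
        "piw": width_sum / n,
        "winkler": winkler_sum / n,
    }


# Toy example: one covered point and one point above the interval.
print(interval_metrics([1.0, 5.0], [0.0, 0.0], [2.0, 2.0], alpha=0.5))
```

The Winkler score rewards narrow intervals but adds a penalty proportional to `2 / alpha` times the miss distance, so it trades off width against coverage in a single number.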

## FAQ
- `src_dir` is empty: ensure SCALE config points to a directory containing `residuals.h5`, `indices.npz`, and `config.yaml`.
- `horizon` mismatch: Stage-2 enforces the same `horizon` as Stage-1.
- GPU/CPU: scripts auto-select GPU if available, otherwise CPU; CPU training is slower.

## Entry points
- Base model training: `experiments/run_base_model.py`
- SCALE training: `experiments/run_scale.py`
- Python config runner: `experiments/run_config.py`

For adding new models, changing backbones, or modifying SGWT details, start with:
- `conformal_model/scale/arch/spec_decoupled_components.py`
- `conformal_model/scale/arch/backbones/`
- `conformal_model/scale/arch/spectral/`
