# Models

This module contains the implementations of the six autoencoder architectures, scripts for training, data processing and inference, and a CLI tool for hyperparameter tuning.

- **architectures:** Contains one folder per model architecture, each with the model implementation (`model.py`), a trainer class (`training.py`) and a config file (`config.json`) that documents tunable and non-tunable hyperparameters.
- **data_utils:** This folder holds code related to processing and loading of the different datasets used in the experiments.
- **inference:** The inference module contains code used for inference on the processed datasets and the generated embeddings for downstream benchmarking. 
- **train_utils:** Contains the general training loop used for benchmarks with all models (`train.py`) and the CLI script for hyperparameter tuning (`tune.py`), which is documented [below](#hyperparameter-tuning).

## Autoencoder Architectures
| Name              | Description                                  |
|-------------------|----------------------------------------------|
| [ConvAE](architectures/ConvAE/model.py)        | Convolutional Autoencoder Network        |
| [DeepCAE](architectures/DeepCAE/model.py)       | Deep Contractive Autoencoder             |
| [JointVAE](architectures/JointVAE/model.py)      | Joint Variational Autoencoder            |
| [KernelPCA](architectures/PCA/model.py)           | Kernel Principal Component Analysis (baseline)  |
| [StandardAE](architectures/StandardAE/model.py)    | Simple Multilayer Perceptron Autoencoder |
| [TransformerAE](architectures/TransformerAE/model.py) | Transformer based Autoencoder            |

## Hyperparameter Tuning

Hyperparameter tuning is run as a CLI tool that starts one tuning job for each combination of model-name, dataset-name and latent-ratio. Currently, the learning rate is the only hyperparameter tuned by default.
The tool takes the following arguments:

| Argument     | Description                                                                                                                 | Options                                                                                                     | Default |
|--------------|-----------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------|---------|
| model-name   | Takes a space-separated list of model names                                                                                 | StandardAE, DeepCAE, JointVAE, PCA, TransformerAE, ConvAE                                                   | None    |
| dataset-name | Takes a space-separated list of dataset names                                                                               | Adult, BankMarketing, BlastChar, CaliforniaHousing, ChurnModelling, Shoppers, Students, Support2, TeaRetail | None    |
| latent-ratio | Takes a space-separated list of latent-ratio values. These determine the ratio between input dimension and latent dimension | $$\text{Number} \in (0, 1]$$                                                                                | 0.505   |
| n-workers    | Number of workers on which tuning jobs can be started (either CPUs or CUDA devices)                                         | $$\text{Number} \in \mathbb{N}^+$$                                                                          | 8       |
| max-time     | Maximum wall-clock time per tuning job (one model, dataset, ratio combination)                                              | $$\text{Number} \in \mathbb{N}^+$$                                                                          | 120     |
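As a rough illustration of how these arguments fit together, the sketch below builds an `argparse` parser that mirrors the table. It is an assumption about the interface, not the contents of `tune.py`: `build_parser` is a hypothetical name, and the actual script may validate or name things differently.

```python
import argparse

# Model names copied from the "Options" column of the table above.
MODELS = ["StandardAE", "DeepCAE", "JointVAE", "PCA", "TransformerAE", "ConvAE"]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented arguments and defaults."""
    parser = argparse.ArgumentParser(prog="tune")
    # nargs="+" accepts a space-separated list of one or more values.
    parser.add_argument("--model-name", nargs="+", choices=MODELS, default=None)
    parser.add_argument("--dataset-name", nargs="+", default=None)
    parser.add_argument("--latent-ratio", nargs="+", type=float, default=[0.505])
    parser.add_argument("--n-workers", type=int, default=8)
    parser.add_argument("--max-time", type=int, default=120)
    return parser


# Parse the arguments of the single-model, single-dataset case.
args = build_parser().parse_args(
    ["--model-name", "StandardAE", "--dataset-name", "TeaRetail"]
)
print(args.model_name, args.dataset_name, args.latent_ratio)
```

Omitting `--latent-ratio`, `--n-workers` and `--max-time` falls back to the defaults listed in the table.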

## Examples

Run hyperparameter tuning for a single model (e.g. StandardAE) and a single dataset (e.g. TeaRetail) with the default latent ratio of 0.505.
```
python -m models.train_utils.tune \
    --model-name StandardAE \
    --dataset-name TeaRetail
```
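Start tuning of the five autoencoder models (all except the PCA baseline) on thirteen datasets at a latent ratio of 0.505. To tune several compression ratios in one call, pass them as a space-separated list to `--latent-ratio`. This was run on a p3.16xlarge instance (8 GPUs with 16 GB of memory each).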
```
python -m models.train_utils.tune \
    --model-name \
        ConvAE \
        DeepCAE \
        JointVAE \
        StandardAE \
        TransformerAE \
    --dataset-name \
        Adult \
        BankMarketing \
        BlastChar \
        CaliforniaHousing \
        Shoppers \
        Students \
        Support2 \
        TeaRetail \
        Abalone \
        Parkinsons \
        Thermography \
        Walmart \
        AirQuality \
    --max-time 3600 \
    --n-workers 8 \
    --latent-ratio 0.505
```
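Because one tuning job is started per (model, dataset, latent ratio) combination, the command above expands to 5 × 13 × 1 = 65 jobs. The sketch below enumerates that grid with `itertools.product`. It only illustrates the counting and is not the scheduling code in `tune.py`.

```python
from itertools import product

# Values taken from the command above.
models = ["ConvAE", "DeepCAE", "JointVAE", "StandardAE", "TransformerAE"]
datasets = [
    "Adult", "BankMarketing", "BlastChar", "CaliforniaHousing", "Shoppers",
    "Students", "Support2", "TeaRetail", "Abalone", "Parkinsons",
    "Thermography", "Walmart", "AirQuality",
]
latent_ratios = [0.505]

# One job per (model, dataset, latent ratio) combination.
jobs = list(product(models, datasets, latent_ratios))

print(len(jobs))  # 5 * 13 * 1 = 65
print(jobs[0])    # ('ConvAE', 'Adult', 0.505)
```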

Hyperparameter tuning for the ChurnModelling dataset was run on a g5.2xlarge instance, using a single GPU with more than 20 GB of memory.
```
python -m models.train_utils.tune \
    --model-name \
        ConvAE \
        DeepCAE \
        JointVAE \
        StandardAE \
        TransformerAE \
    --dataset-name \
        ChurnModelling \
    --max-time 3600 \
    --n-workers 1 \
    --latent-ratio 0.505
```
