# Zonotopic Optimizers for Shallow ReLU Networks

This repo contains the code to replicate the experiments in the *A Combinatorial Perspective on the Optimization of
Shallow ReLU Networks* NeurIPS 2022 submission.
It also provides a base for those who want to build on our work.

## Scripts

The scripts are all contained in the `scripts` directory.
They are organized into the `scripts/synthetic` and `scripts/classification` subfolders
based on the datasets used. As mentioned in the paper, there are differences in how
we trained the models for the respective datasets.

Each script defines a config data class that provides the parameters of the run.
The configs used in the paper reside in the `exps` directory. Again, they are organized into
`synthetic` and `classification` subfolders. Each of those subfolders has
files in a 1-to-1 correspondence with the scripts. Each script `$SCRIPT.py` has its configurations
in the corresponding `$SCRIPT_configs.py` file. The configs are stored in a variable called `CONFIGS`
in each of the config files that maps config names to config instances.
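
As a rough illustration of this layout, a config file might look something like the sketch below. The class name `ExampleConfig` and its fields are hypothetical stand-ins chosen for this example; the real config data classes are the ones defined in each script.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExampleConfig:
    # Hypothetical fields, not the repo's actual config schema.
    d: int           # input dimension
    m: int           # number of hidden units
    n_examples: int  # training set size


# Maps config names to config instances, as described above.
CONFIGS = {
    "mnist49_d8_m8_N350": ExampleConfig(d=8, m=8, n_examples=350),
}
```

A script would then look up its parameters by name, for example `CONFIGS["mnist49_d8_m8_N350"]`.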

All of the scripts have the following flags in common.
- `--outdir` The path to the directory to write output to. It *must* already exist. The script will write one or more json files to this directory containing the results of the runs.
- `--configs_path` The Python path (i.e. using dots instead of slashes) to a dict mapping config names to config instances. This defaults to the script's respective config file in the `exps` directory.
- `--config` Name of the entry in the configs dict to use as configuration.
- `--n_runs` Number of times to repeat the experiment.
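
One plausible way to resolve a dotted `--configs_path` value is sketched below. This is an assumption about how such a flag can be handled, not a copy of the scripts' own logic: the helper `load_configs` is hypothetical, and the sketch assumes the last component of the path names the dict itself. A standard-library dict is used so the example runs without the repo.

```python
import importlib


def load_configs(dotted_path):
    """Import the module part of the path and return the named attribute."""
    # Split "package.module.NAME" into "package.module" and "NAME".
    module_path, _, attr_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_path)
    return getattr(module, attr_name)


# Demonstration with a dict from the standard library.
# Under the stated assumption, a repo path would look like
# "exps.classification.mgls_configs.CONFIGS" (hypothetical).
entities = load_configs("html.entities.name2codepoint")
config = entities["amp"]  # plays the role of --config selecting one entry
```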

The `scripts/classification/mgls.py` script has the following additional flags.
- `--n_processes` The number of processes to use for parallelism. Our implementation of mGLS allows us to assign neighboring vertices to different processes to solve their convex programs in parallel. We found this significantly helps scaling.
- `--save_frequency` If left empty, results are saved only at the end of each run. If set to an integer, results are saved once every that many steps during each run. We found that, as training progresses, the time per step significantly increases and the per-step drop in training loss significantly decreases. The ability to save during training was useful in practice when we launched jobs and were not sure whether they would finish before running out of time.
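
The `--save_frequency` behavior can be pictured with the small sketch below. It is an illustration under assumptions, not the script's code: the helper `should_save` is hypothetical, steps are counted from 1, and a final save at the end of the run is assumed even when a frequency is set.

```python
def should_save(step, n_steps, save_frequency=None):
    """Decide whether to write results after the given (1-based) step."""
    is_last_step = step == n_steps
    if save_frequency is None:
        # Flag left empty: save only at the end of the run.
        return is_last_step
    # Flag set: save every `save_frequency` steps, plus the final step.
    return is_last_step or step % save_frequency == 0


periodic = [s for s in range(1, 26) if should_save(s, 25, save_frequency=10)]
end_only = [s for s in range(1, 26) if should_save(s, 25)]
```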

### Examples

The script below performs 8 runs of mGLS on the MNIST 4/9 binary classification task. We use the first d=8 whitened principal components and take N=350 examples to form the training dataset. We use a model with m=8 hidden units. The results will be saved to the `$OUTDIR` directory.

```bash
OUTDIR=path/to/outdir
CONFIG=mnist49_d8_m8_N350

python3 scripts/classification/mgls.py \
    --outdir=$OUTDIR \
    --config=$CONFIG \
    --n_runs=8 \
    --n_processes=16 \
    --save_frequency=10
```
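
Because `--outdir` must already exist, it can be convenient to create it before launching. The sketch below builds the same command from Python after creating the directory. The helper `build_mgls_command` is hypothetical and only uses the flags documented above.

```python
import sys
from pathlib import Path


def build_mgls_command(outdir, config, n_runs, n_processes, save_frequency=None):
    """Create the output directory and return the argv list for mgls.py."""
    # The script requires the output directory to exist beforehand.
    Path(outdir).mkdir(parents=True, exist_ok=True)
    cmd = [
        sys.executable,
        "scripts/classification/mgls.py",
        f"--outdir={outdir}",
        f"--config={config}",
        f"--n_runs={n_runs}",
        f"--n_processes={n_processes}",
    ]
    if save_frequency is not None:
        cmd.append(f"--save_frequency={save_frequency}")
    return cmd
```

When run from the repository root, the returned list can be passed to `subprocess.run(cmd, check=True)`.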

## Results
Each script will output one or more json files to the `--outdir`. These files will contain information about the parameters of the run along with the results of the run.
We have included some scripts that take these raw json outputs and provide a cleaner summary of the results in CSV form.
These are the `results/make_synthetic_csv.py` script for the synthetic dataset experiments and the `results/make_classification_csv.py` script for the binary classification experiments.
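
For a quick look at the raw outputs without the CSV scripts, the json files can be loaded directly. The sketch below makes no assumption about the keys inside each file; the helper name `load_results` is hypothetical.

```python
import json
from pathlib import Path


def load_results(outdir):
    """Return {file name: parsed JSON} for each .json file directly in outdir."""
    return {
        path.name: json.loads(path.read_text())
        for path in sorted(Path(outdir).glob("*.json"))
    }
```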

These scripts take the same flags, except that `make_synthetic_csv.py` has a `--gls_subdir` flag while `make_classification_csv.py` has a `--mgls_subdir` flag. Since these mean almost the same thing, we refer to both as the `--(m)gls_subdir` flag.
The set of flags taken by these scripts is as follows.
- `--data_dir` The path to the directory containing the raw json results. We assume that the results from each method are contained within a different subdirectory of this folder. Each of those subdirectories *must* contain only synthetic dataset results or only binary classification results. Having both within a single subdirectory will break these scripts.
- `--(m)gls_subdir` The name of the subdirectory holding the (m)GLS results. Defaults to `"(m)gls"`.
- `--gradient_descent_subdir` The name of the subdirectory holding the gradient descent results. Defaults to `"gradient_descent"`.
- `--random_vertex_subdir` The name of the subdirectory holding the random vertex results. Defaults to `"random_vertex"`.
- `--reduction` The reduction to perform to reduce a list of values to a single value representing it. Must be either `"median"` (default) or `"mean"`.
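
To make the `--reduction` option concrete, the sketch below collapses a list of per-run values into a single number using either choice. It is a minimal illustration with made-up values; the helper `reduce_values` is hypothetical and not taken from the CSV scripts.

```python
import statistics

# The two reductions named by the flag.
REDUCTIONS = {
    "median": statistics.median,
    "mean": statistics.mean,
}


def reduce_values(values, reduction="median"):
    """Reduce a list of numbers to one representative value."""
    if reduction not in REDUCTIONS:
        raise ValueError(f"reduction must be one of {sorted(REDUCTIONS)}")
    return REDUCTIONS[reduction](values)


# Made-up per-run values; one outlier pulls the mean but not the median.
example = [1.0, 2.0, 10.0]
median_value = reduce_values(example)
mean_value = reduce_values(example, "mean")
```

The median is less sensitive to a single unusually good or bad run, which is one reason it can be a sensible default.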
