# Subtrajectory Balance Grid Experiment Code
This repository contains the code for running grid experiments on subtrajectory
balance for GFlowNets.  The rest of this README describes how to install and run
the code.

## Installation
First let's create the virtualenv.

```
$ mkdir ~/venvs
$ cd ~/venvs
$ virtualenv <venv_name>
$ echo 'export WANDB_PROJECT="gflownet_subtb"' >> <venv_name>/bin/activate
$ source <venv_name>/bin/activate
```

Next we'll install the packages required for the repository.  Note that we install
`torch` separately from `requirements.txt` because it must be installed from a URL.

```
$ cd ~/repos/gflownet_based_exploration
$ pip install -r requirements.txt
$ pip install torch==1.9.0+cu102 -f https://download.pytorch.org/whl/torch_stable.html
$ pip install -e .
```

The repository should now be installed!  For wandb to work properly, some additional
setup is required.  Place a file containing your wandb key somewhere on your filesystem
and add the following line to your shell's rc file:

```
export WANDB_API_KEY_FILE=<path_to_your_wandb_key>
```

To log to a specific project, set the `WANDB_PROJECT` environment variable to the name
of that project in the same manner.  Note that the virtualenv setup above already appends
an export of `WANDB_PROJECT="gflownet_subtb"` to the virtualenv's `activate` script.
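
Before launching anything, it can help to confirm that these variables are visible to
Python.  The snippet below is a minimal sketch, not part of the repository: the helper
name `check_wandb_env` is hypothetical, and it only checks that `WANDB_API_KEY_FILE`
is set and points at an existing file.

```python
import os
from pathlib import Path


def check_wandb_env(env=None):
    """Return a list of problems found in the wandb-related environment variables.

    `check_wandb_env` is a hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    problems = []

    key_file = env.get("WANDB_API_KEY_FILE")
    if not key_file:
        problems.append("WANDB_API_KEY_FILE is not set")
    elif not Path(key_file).expanduser().is_file():
        problems.append(f"WANDB_API_KEY_FILE points to a missing file: {key_file}")

    # WANDB_PROJECT is optional, so its absence is not treated as a problem.
    return problems


if __name__ == "__main__":
    for problem in check_wandb_env():
        print("warning:", problem)
    print("WANDB_PROJECT =", os.environ.get("WANDB_PROJECT", "<unset>"))
```

An empty list means the key file variable is set and the file exists; it does not
validate the key itself.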

To verify that your installation is working properly, run the following and check
that logs appear in wandb.

```
$ cd gfn_subtb_grid
$ python launch_experiment.py -c subtb_grad_bias_var
```

## Usage
All experiments should be started by running `python launch_experiment.py` with some
set of arguments.  This repository makes heavy use of [Ray Tune](https://docs.ray.io/en/latest/tune/index.html)
to run experiments.  When the `launch_experiment.py` script is invoked, it reads the
experiment configuration you specify from your file system.  The configuration
is a dictionary assigned to a global variable named `CONFIG` in a Python configuration
file.  For examples, see the configuration files in the `configs` directory.
After reading the configuration, `launch_experiment.py`
starts the trial using Ray, doing hyperparameter tuning if desired or simply
running a single trial if not.  Ray will run the experiment for you and automatically
log results to wandb.  Further, Ray can run multiple experiments in parallel if your 
allocation has multiple GPUs.  A typical use case is running 8 parallel experiments 
on an allocation with 4 GPUs (running two trials per GPU) in a hyperparameter search
for a new setting, or simply running many experiments in parallel across different 
random seeds.
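
The "two trials per GPU" arrangement corresponds to each trial requesting a fraction of a
GPU.  The sketch below only works through that arithmetic; the helper `gpus_per_trial`
is hypothetical, and how this repository actually passes resource requests to Ray is an
assumption (Ray Tune does accept fractional GPU requests, for example through the
`resources_per_trial` argument of `tune.run`).

```python
def gpus_per_trial(num_gpus, num_parallel_trials):
    """Fraction of a GPU each trial may request so all trials run at once.

    Hypothetical helper for illustration; not part of this repository.
    """
    if num_gpus <= 0 or num_parallel_trials <= 0:
        raise ValueError("both arguments must be positive")
    return num_gpus / num_parallel_trials


# 8 parallel trials on an allocation with 4 GPUs: half a GPU per trial.
resources = {"gpu": gpus_per_trial(num_gpus=4, num_parallel_trials=8)}
print(resources)
```

Fractional requests only control scheduling; each trial must still fit in its share of
GPU memory.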

There are five options which can be passed to the `launch_experiment.py` script.
The first and most important is `-c` or `--config_name`, which takes the name of the
Python file containing the configuration you would like to run, without the `.py` suffix
and without any parent directories.  For example, if your configuration file is stored at
`configs/untracked_configs/my_config.py`, you should run `python launch_experiment.py -c my_config`.
If you include the directories or the suffix, the config will not be found.

The configuration file itself is read through a few global variables.  Your configuration
dictionary MUST be stored in a global variable named `CONFIG` or it will not be found.
The Ray Tune documentation describes hyperparameter search algorithms and trial schedulers.
To use one of these in your run, assign your preferred searcher or scheduler to the
`SEARCH_ALGORITHM` or `SCHEDULER_ALGORITHM` global variable, respectively.  Finally,
this repository subclasses `ray.tune.Trainable` in several "driver" classes which contain
the main training logic.  To select the trainer you would like to use, assign it to the
`TRAINER` global variable in your configuration file.
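
To make the lookup rules concrete, here is a minimal sketch of how a launcher could find a
config by its bare name and read the four globals.  It is an illustration under
assumptions, not the code in `launch_experiment.py`: the functions `load_config_module`
and `read_experiment_globals`, the `PlaceholderTrainer` class, and the keys inside
`CONFIG` are all hypothetical, and the real configs would assign a `ray.tune.Trainable`
subclass to `TRAINER`.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_config_module(config_name, configs_root):
    """Find `<config_name>.py` anywhere under `configs_root` and import it."""
    if config_name.endswith(".py") or "/" in config_name:
        raise ValueError("pass the bare config name, without directories or '.py'")
    matches = sorted(Path(configs_root).rglob(f"{config_name}.py"))
    if not matches:
        raise FileNotFoundError(f"no config named {config_name!r} under {configs_root}")
    spec = importlib.util.spec_from_file_location(config_name, matches[0])
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    if not hasattr(module, "CONFIG"):
        raise AttributeError(f"{matches[0]} does not define a global named CONFIG")
    return module


def read_experiment_globals(module):
    """Collect the globals described above; only CONFIG is treated as mandatory here."""
    return {
        "config": module.CONFIG,
        "trainer": getattr(module, "TRAINER", None),
        "search_alg": getattr(module, "SEARCH_ALGORITHM", None),
        "scheduler": getattr(module, "SCHEDULER_ALGORITHM", None),
    }


# A hypothetical config file; the keys in CONFIG are made up for illustration.
EXAMPLE_CONFIG = '''
class PlaceholderTrainer:  # stands in for a ray.tune.Trainable subclass
    pass

CONFIG = {"seed": 0, "learning_rate": 1e-3}
TRAINER = PlaceholderTrainer
SEARCH_ALGORITHM = None
SCHEDULER_ALGORITHM = None
'''

with tempfile.TemporaryDirectory() as root:
    config_dir = Path(root) / "untracked_configs"
    config_dir.mkdir()
    (config_dir / "my_config.py").write_text(EXAMPLE_CONFIG)
    # Equivalent in spirit to `python launch_experiment.py -c my_config`.
    loaded = read_experiment_globals(load_config_module("my_config", root))

print(loaded["config"])
```

Searching the whole `configs` tree for the bare name is one way to explain why parent
directories must be omitted; whether the launcher resolves names this way is an assumption.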

The other command line options are less critical.  `-n` or `--num_training_iterations` is
an integer-valued argument which sets how many times the training procedure is looped.
More precisely, it is the number of times that Ray will call the `ray.tune.Trainable.step()`
method on the trainer object you are using.  `-w` or `--wandb_log_dir` tells the launcher
where to write your wandb logs.  If this option is not passed, the launcher defaults to
`~/scratch/wandb/gflownet_exploration/` and creates the directory if needed.
The `-e` or `--experiment_name` option sets the "Group" tag assigned to the runs for this
configuration in wandb.  If not supplied, it defaults to your config name.  Finally, the
`-t` or `--entity_name` option sets the wandb entity you would like to log to.  If not
supplied, wandb's default entity is used.
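
The five options and their documented defaults can be summarized with an `argparse`
sketch.  This is an approximation for orientation, not the parser in
`launch_experiment.py`: treating `-c` as required and leaving `-n` without a default are
assumptions, and the helper names `build_parser` and `parse_args` are hypothetical.

```python
import argparse
from pathlib import Path

DEFAULT_WANDB_DIR = "~/scratch/wandb/gflownet_exploration/"


def build_parser():
    """Sketch of the launcher's command line interface (not the real parser)."""
    parser = argparse.ArgumentParser(description="Launch a Ray Tune experiment.")
    parser.add_argument("-c", "--config_name", required=True,  # assumed required
                        help="config file name without directories or '.py'")
    parser.add_argument("-n", "--num_training_iterations", type=int, default=None,
                        help="number of Trainable.step() calls (default is assumed)")
    parser.add_argument("-w", "--wandb_log_dir", default=DEFAULT_WANDB_DIR,
                        help="directory for wandb logs")
    parser.add_argument("-e", "--experiment_name", default=None,
                        help="wandb group name; falls back to the config name")
    parser.add_argument("-t", "--entity_name", default=None,
                        help="wandb entity; None leaves wandb's default in place")
    return parser


def parse_args(argv):
    args = build_parser().parse_args(argv)
    if args.experiment_name is None:
        args.experiment_name = args.config_name
    # The real launcher also creates this directory; the sketch only resolves it.
    args.wandb_log_dir = Path(args.wandb_log_dir).expanduser()
    return args


args = parse_args(["-c", "subtb_grad_bias_var", "-n", "100"])
print(args.experiment_name, args.num_training_iterations, args.wandb_log_dir)
```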

## Configs
`configs/subtb_grad_bias_var.py` contains the config for gradient variance and bias experiments.

