# Code and data for tree-structured reward learning

This folder contains all the code required to run reward learning on the aircraft handling environment with both tree and neural network reward models, apart from the library dependencies listed in `requirements.txt`.

## scripts/run.py

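Runs the online reward learning algorithm. In the examples below, the positional arguments are the task, the agent and the number of episodes; optional flags set the reward model, the preference collection schedule, oracle noise and logging.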

- Run for 200 episodes with a tree-structured reward model (0-1 splitting) and a PETS agent on the **Follow** task:

```
python scripts/run.py follow pets 200 --model=tree_0-1
```

- As above, but with logging to Weights & Biases enabled (requires the `wandb` library to be installed and configured):

```
python scripts/run.py follow pets 200 --model=tree_0-1 --do_wandb=1
```

- Change the task to **Chase** and the reward model to a neural network, and only collect 500 preferences (default = 1000):

```
python scripts/run.py chase pets 200 --model=net --schedule=500_200_1_1
```

- Change the task to **Land** and the reward model to a tree with variance-based splitting, and add oracle noise:

```
python scripts/run.py land pets 200 --model=tree_var --irrationality=beta_land=2
```

- Change the agent to SAC, and run for 50000 episodes (warning: long runtime):

```
python scripts/run.py follow sac 50000 --model=tree_0-1 --schedule=1k_50k_250_250
```
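
To queue several such runs at once, the command lines can be assembled programmatically. The following is a minimal sketch and not part of this repository: the helper `build_run_command` is hypothetical, it uses only the positional arguments and the `--model` and `--do_wandb` flags shown in the examples above, and it assumes that every reward model can be paired with every task.

```python
import shlex
from itertools import product


def build_run_command(task, agent, episodes, model, do_wandb=False):
    """Return the argument list for one scripts/run.py call (hypothetical helper)."""
    cmd = ["python", "scripts/run.py", task, agent, str(episodes), f"--model={model}"]
    if do_wandb:
        # Same flag as in the Weights & Biases example above.
        cmd.append("--do_wandb=1")
    return cmd


# Assumption: each model name is valid for each task.
tasks = ["follow", "chase", "land"]
models = ["tree_0-1", "tree_var", "net"]

commands = [build_run_command(task, "pets", 200, model)
            for task, model in product(tasks, models)]

# Dry run: print each command line instead of executing it.
for cmd in commands:
    print(shlex.join(cmd))
```

To launch the runs rather than print them, each argument list could be passed to `subprocess.run(cmd, check=True)` from the repository root.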

## scripts/enjoy.py

Loads trained reward models or oracles and deploys PETS agents using them, with rendering enabled.

- Deploy using the tree model (0-1 splitting) for the **Follow** task:

```
python scripts/enjoy.py follow tree_0-1
```

- Deploy using the neural network model for the **Chase** task:

```
python scripts/enjoy.py chase net
```

- Deploy using the oracle reward function for the **Land** task:

```
python scripts/enjoy.py land oracle
```

## scripts/show_tree.py

Visualises trained tree-structured reward models as diagrams.

- Show the tree model (0-1 splitting) for the **Chase** task, displaying with Matplotlib:

```
python scripts/show_tree.py chase tree_0-1
```

- Show the tree model (variance-based splitting) for the **Land** task, saving out as an `.svg` file:

```
python scripts/show_tree.py land tree_var --svg=1
```