# Colour Gridworld Environment

This repo contains the code for the colour gridworld environment used in the experimental evaluation of the research paper "Shielding Regular Safety Properties in Reinforcement Learning".

It also includes an implementation of Q-learning and the model checking algorithms described in the paper.

## Dependencies

The main dependencies of the project are ```gym==0.19.0```, ```numpy>=1.21.5``` and ```jax==0.4.6```. We provide a ```requirements.txt``` file, but we recommend first setting up a conda environment for this project:
```
conda create -n jax --clone base
```
The specific jaxlib installation you require is hardware dependent and may differ from ours; see https://github.com/google/jax#pip-installation-gpu-cuda for details. Otherwise you can install the requirements with pip or similar:

```
conda activate jax
pip install -r requirements.txt
```
## Running experiments

To run experiments on colour gridworld, run any one of the following files:
```
python train_q_learning.py
python train_modified_q_learning.py
python train_shielded_q_learning.py
```
Command line arguments can be passed to the Python files like so:
```
python train_q_learning.py --lr 0.01
```
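If you want to try several values of an argument, a small driver script can build and launch the commands for you. The following is a minimal sketch and is not part of this repo: the helper names `build_command` and `run_sweep`, the `runs` log directory prefix, and the idea of one log directory per run are assumptions made for illustration. Only the ```--lr``` and ```--logdir``` flags come from this README.

```python
import subprocess
import sys


def build_command(script, **flags):
    """Return an argv list such as [python, script, '--lr', '0.01'].

    Hypothetical helper: each keyword argument becomes a '--name value' pair.
    """
    command = [sys.executable, script]
    for name, value in flags.items():
        command += [f"--{name}", str(value)]
    return command


def run_sweep(script, learning_rates, logdir_prefix="runs"):
    """Run `script` once per learning rate, each with its own log directory."""
    for lr in learning_rates:
        command = build_command(script, lr=lr, logdir=f"{logdir_prefix}/lr_{lr}")
        # check=True stops the sweep as soon as one run fails.
        subprocess.run(command, check=True)
```

For example, `run_sweep("train_q_learning.py", [0.1, 0.01])` would launch two training runs one after the other; the learning rates here are illustrative values, not recommendations.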
You can train with a different safety property by specifying the path to the property file:
```
python train_shielded_q_learning.py --property "./properties/property_1.py"
```
You can also load your own custom properties, so long as the file you provide follows the same structure as the provided property files, i.e. it must define ```automaton``` and ```cost_function``` objects and a callable named either ```pctl_property``` or ```product_pctl_property```.
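
Before starting a long run, it can help to check that a custom property file has the expected structure. The sketch below is an assumption-laden illustration, not the loader used by the training scripts: `load_property_module` and `missing_definitions` are hypothetical helper names, and the check only tests that the required names exist, not that the objects have the right types.

```python
import importlib.util
from pathlib import Path

# Names a property file must define, as listed in this README.
REQUIRED_OBJECTS = ("automaton", "cost_function")
PROPERTY_FUNCTIONS = ("pctl_property", "product_pctl_property")


def load_property_module(path):
    """Import a property file from its path and return the module object."""
    path = Path(path)
    spec = importlib.util.spec_from_file_location(path.stem, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load property file: {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def missing_definitions(module):
    """Return a list of problems; an empty list means the structure looks right."""
    problems = [
        f"missing object: {name}"
        for name in REQUIRED_OBJECTS
        if not hasattr(module, name)
    ]
    # At least one of the two property functions must be present and callable.
    if not any(callable(getattr(module, name, None)) for name in PROPERTY_FUNCTIONS):
        problems.append("missing callable: pctl_property or product_pctl_property")
    return problems
```

Calling `missing_definitions(load_property_module("./properties/property_1.py"))` should return an empty list for a file that follows the required structure.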

To reproduce the results in our paper, run:
```
python run_paper_experiments.py
```
## Plotting

Our code uses TensorBoard for plotting runs. Specify the ```--logdir my_logdir``` command line argument when running experiments, then launch TensorBoard:
```
tensorboard --logdir my_logdir
```