# PAE: REINFORCEMENT LEARNING FROM EXTERNAL KNOWLEDGE FOR EFFICIENT EXPLORATION

This is the codebase accompanying the ICLR submission "PAE: Reinforcement Learning from External Knowledge for Efficient Exploration".

Note to reviewers:
1. The codebase will be further cleaned up in time for the camera-ready submission and released on GitHub.
2. Files in the `subgoals/minigrid` and `subgoals/babyai` folders contain institutional affiliations, but this does *not* reveal author identity: these folders are clones of existing open-source repositories (with slight modifications), available at https://github.com/maximecb/gym-minigrid and https://github.com/mila-iqia/babyai/ . Our original code is in `PAE/` and is fully anonymized.

# Setup
The code targets Python 3.9. We recommend creating a `conda` environment:

## Create a new conda env
```
conda create -n PAE python=3.9
conda activate PAE
```

## Install pinned setuptools, wheel, and protobuf
```
pip install setuptools==63.2.0
pip install wheel==0.38.4
pip install protobuf==3.20
```

## Install
```
pip install -r requirements.txt
```

Note that you may need to follow the PyTorch setup instructions at https://pytorch.org/ to install PyTorch on your own machine.
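After installation, a quick sanity check can confirm that the interpreter and key packages are in place. The sketch below is not part of the codebase: the helper name `check_environment` is hypothetical, and the module list (`torch`, `hydra`, `wandb`) is an assumption based on the tools mentioned in this README rather than the actual contents of `requirements.txt`.

```python
import importlib.util
import sys


def check_environment(required=("torch", "hydra", "wandb"), python=(3, 9)):
    """Return a list of problems found; an empty list means the check passed.

    `required` is an assumed list of import names, not read from requirements.txt.
    """
    problems = []
    found = tuple(sys.version_info[:2])
    if found != tuple(python):
        problems.append(
            f"expected Python {python[0]}.{python[1]}, found {found[0]}.{found[1]}"
        )
    for name in required:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append(f"module not importable: {name}")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    for issue in issues:
        print("WARNING:", issue)
    if not issues:
        print("Environment looks OK.")
```

Run it inside the activated `PAE` env; any warning about `torch` likely means the platform-specific PyTorch installation still needs to be done.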

# Running Experiments
```
cd knowledge-instructed-RL
./scripts/Run-PAE.sh
```
Use the script `scripts/Run-PAE.sh`, which contains preconfigured arguments and hyperparameters for all experiments in the paper. Experiments use Hydra to manage configuration.

Visualizing results requires `wandb`. Set the project name with the `project` key in `PAE/conf/config.yaml`.

For example, the following command runs PAE on the key3 environment:

```
OMP_NUM_THREADS=1 python -m PAE.train -m +experiment="key3" group="ICLR-key3"
```

`OMP_NUM_THREADS=1` is essential to prevent CPU ops from hanging.
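To launch several runs from Python rather than the shell, the same command can be assembled programmatically. This is a minimal sketch, not part of the codebase: `build_pae_run` and `launch` are hypothetical helpers, and `extra_overrides` assumes that additional keys in `PAE/conf/config.yaml` (such as `project`) can be overridden on the command line with Hydra's usual `key=value` syntax, in the same way `group` is.

```python
import os
import subprocess


def build_pae_run(experiment, group, extra_overrides=None):
    """Build (argv, env) mirroring the shell command shown above."""
    # Copy the current environment and force single-threaded OpenMP.
    env = dict(os.environ, OMP_NUM_THREADS="1")
    argv = [
        "python", "-m", "PAE.train", "-m",
        f"+experiment={experiment}",
        f"group={group}",
    ]
    # Assumed: extra config keys are accepted as plain Hydra overrides.
    for key, value in (extra_overrides or {}).items():
        argv.append(f"{key}={value}")
    return argv, env


def launch(experiment, group, extra_overrides=None):
    """Run the training command and raise if it exits with an error."""
    argv, env = build_pae_run(experiment, group, extra_overrides)
    return subprocess.run(argv, env=env, check=True)


if __name__ == "__main__":
    argv, _ = build_pae_run("key3", "ICLR-key3")
    print(" ".join(argv))  # inspect the command before calling launch()
```

Keeping command construction separate from `launch` makes it easy to print and review the exact command before starting a long run.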

`data/babyai` contains the BERT-base sentence embeddings of the knowledge, which are described in detail in the main paper and appendix.

The environment is selected with the `+experiment=` flag. Each value corresponds to a YAML file in `PAE/conf/experiment/`; see that folder for the list of available experiments.
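The available experiment names can also be listed programmatically. The sketch below is illustrative only: `list_experiments` is a hypothetical helper, and it assumes each experiment name is simply the file name without its `.yaml` (or `.yml`) extension and that the script is run from the repository root.

```python
from pathlib import Path


def list_experiments(conf_dir="PAE/conf/experiment"):
    """Return sorted experiment names found as YAML files in conf_dir."""
    root = Path(conf_dir)
    if not root.is_dir():
        # Wrong working directory or missing folder: report nothing found.
        return []
    return sorted(
        path.stem
        for path in root.iterdir()
        if path.is_file() and path.suffix in {".yaml", ".yml"}
    )


if __name__ == "__main__":
    for name in list_experiments():
        # Each name is a candidate value for the +experiment= flag.
        print(f"+experiment={name}")
```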

Other hyperparameters are explained in `PAE/conf/config.yaml` and our paper.

# Questions?

For any questions, please reach out to the authors during the rebuttal phase!
