# Active exploration for inverse reinforcement learning

This is the code accompanying the paper "Active Exploration for Inverse Reinforcement Learning". Here, we describe how to reproduce the experiments presented in the paper.

## Setup

We recommend using [Anaconda](https://www.anaconda.com/) to set up an environment with the dependencies of this repository. If Anaconda is installed, you can create and activate an environment using:
```bash
conda create -n "active-irl" python=3.9
conda activate active-irl
```

Then, install dependencies by running
```bash
pip install -e .
```
in the root folder of the code (where this README is located).


## Reproducing experiments in the main paper

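All experiments can be run using the `scripts/experiments/experiment.py` script. We use [`sacred`](https://github.com/IDSIA/sacred) to keep track of experiment parameters. On the command line, everything after `with` is passed to `sacred`: a named configuration such as `four_paths`, followed by optional `key=value` overrides.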

### Running active IRL experiments

Run the following commands to reproduce the experiments in the main paper:
```bash
python scripts/experiments/experiment.py with four_paths n_ep_per_iter=50 results_file="result_aceirl_four_paths_50ep_50runs.csv"
python scripts/experiments/experiment.py with four_paths n_ep_per_iter=100 results_file="result_aceirl_four_paths_100ep_50runs.csv"
python scripts/experiments/experiment.py with four_paths n_ep_per_iter=200 results_file="result_aceirl_four_paths_200ep_50runs.csv"

python scripts/experiments/experiment.py with double_chain n_ep_per_iter=50 results_file="result_aceirl_double_chain_50ep_50runs.csv"
python scripts/experiments/experiment.py with double_chain n_ep_per_iter=100 results_file="result_aceirl_double_chain_100ep_50runs.csv"
python scripts/experiments/experiment.py with double_chain n_ep_per_iter=200 results_file="result_aceirl_double_chain_200ep_50runs.csv"

python scripts/experiments/experiment.py with random_env results_file="result_aceirl_random_mdp_50runs.csv"
python scripts/experiments/experiment.py with chain results_file="result_aceirl_chain_mdp_50runs.csv"
python scripts/experiments/experiment.py with gridworld results_file="result_aceirl_gridworld_50runs.csv"
```

To parallelize experiments, you can additionally pass the `n_jobs` parameter, for example:
```bash
python scripts/experiments/experiment.py with four_paths n_ep_per_iter=200 n_jobs=50
```
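
If you prefer not to type each sweep command by hand, the `four_paths` and `double_chain` sweeps could be generated with a small driver script. The following is a minimal sketch, not part of this repository: `build_commands` and `run_all` are hypothetical helpers, and the results file names simply mirror the pattern used in the commands above. By default it only prints the commands (dry run).

```python
import shlex
import subprocess
import sys

# Path of the experiment script, relative to the repository root.
SCRIPT = "scripts/experiments/experiment.py"


def build_commands(envs=("four_paths", "double_chain"),
                   episodes=(50, 100, 200),
                   n_jobs=None):
    """Return one argument list per (environment, n_ep_per_iter) pair.

    Hypothetical helper: the results file names follow the naming
    pattern of the commands listed in this README.
    """
    commands = []
    for env in envs:
        for n_ep in episodes:
            cmd = [
                sys.executable, SCRIPT, "with", env,
                f"n_ep_per_iter={n_ep}",
                f"results_file=result_aceirl_{env}_{n_ep}ep_50runs.csv",
            ]
            if n_jobs is not None:
                # Optional parallelism, as described above.
                cmd.append(f"n_jobs={n_jobs}")
            commands.append(cmd)
    return commands


def run_all(commands, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default; pass dry_run=False to actually launch the runs.
    run_all(build_commands(), dry_run=True)
```

Run the script from the root folder of the repository so that the relative script path resolves.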

### Running reward-free exploration experiments in the appendix

Run the following commands to reproduce the reward-free exploration experiments:
```bash
python scripts/experiments/experiment.py with double_chain_rfe n_ep_per_iter=1000
python scripts/experiments/experiment.py with double_chain_rfe n_ep_per_iter=3000
python scripts/experiments/experiment.py with double_chain_rfe n_ep_per_iter=5000
python scripts/experiments/experiment.py with four_paths_rfe
```

### Creating plots

The results are saved in `results/`. To plot the results and reproduce Table 1 of the paper, use the `scripts/experiments/plot_results.ipynb` notebook.
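
Before opening the notebook, it can be useful to check which result files exist and what they contain. The sketch below is an illustration only and not part of this repository: `summarize_results` is a hypothetical helper, and it assumes that the files in `results/` are CSV files whose first row is a header (the column names are not assumed).

```python
import csv
from pathlib import Path


def summarize_results(results_dir="results"):
    """Map each CSV file name to (column names, number of data rows).

    Assumes the first row of every CSV file is a header row.
    """
    summary = {}
    for path in sorted(Path(results_dir).glob("*.csv")):
        with path.open(newline="") as f:
            reader = csv.reader(f)
            header = next(reader, [])        # empty file -> no columns
            n_rows = sum(1 for _ in reader)  # remaining rows are data
        summary[path.name] = (header, n_rows)
    return summary


if __name__ == "__main__":
    for name, (header, n_rows) in summarize_results().items():
        print(f"{name}: {n_rows} rows, columns={header}")
```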
