# UCBimitation

Download the expert data:

https://drive.google.com/drive/folders/17qHTDeAl8qZyfFJgS0Z6bP41LcHge9t3?usp=share_link

and

https://drive.google.com/drive/folders/1q1kYVXtvbEe_iKSr6f4Aw7zdcQAJpb_h?usp=share_link


***For figure 1***

To reproduce the results, use the following command:

```python train_expert/infinite_imitation.py --env-name DiscreteGaussianGridworld-v0  --expert-trajs noisy_trajs11.pkl --num-threads 1 --max-iter-num 200 --save-model-interval 10 --grid-type 1 --beta 8 --noiseE 0.1```

To reproduce the proximal point results, use:

```python train_learner/ppil.py --env-name DiscreteGaussianGridworld-v0  --expert-trajs noisy_trajs11.pkl --num-threads 1 --max-iter-num 200 --save-model-interval 10 --grid-type 1 --eta 1 --noiseE 0.1```

For GAIL

```python train_learner/gail.py --env-name DiscreteGaussianGridworld-v0  --expert-trajs noisy_trajs11.pkl --num-threads 1 --max-iter-num 200 --save-model-interval 10 --grid-type 1 --eta 1 --noiseE 0.1```

For AIRL (the GAIL script with `--reward-type airl`)

```python train_learner/gail.py --env-name DiscreteGaussianGridworld-v0  --expert-trajs noisy_trajs11.pkl --num-threads 1 --max-iter-num 200 --save-model-interval 10 --grid-type 1 --eta 1 --noiseE 0.1 --reward-type airl```

For REIRL

```python train_learner/reirl.py --env-name DiscreteGaussianGridworld-v0  --expert-trajs noisy_trajs11.pkl --num-threads 1 --max-iter-num 200 --save-model-interval 10 --grid-type 1 --eta 1 --noiseE 0.1```

For IQLearn

```python train_learner/iqlearn.py --env-name DiscreteGaussianGridworld-v0  --expert-trajs noisy_trajs11.pkl --num-threads 1 --max-iter-num 200 --save-model-interval 10 --grid-type 1 --eta 1 --noiseE 0.1```

***For figure 2***


To reproduce the results, use the following commands, one per `--noiseE` setting (0.0, 0.05, and 0.1):

```python train_expert/infinite_imitation.py --env-name DiscreteGaussianGridworld-v0  --expert-trajs trajs16.pkl --num-threads 1 --max-iter-num 200 --save-model-interval 10 --grid-type 1 --beta 8 --noiseE 0.0```

```python train_expert/infinite_imitation.py --env-name DiscreteGaussianGridworld-v0  --expert-trajs trajs6.pkl --num-threads 1 --max-iter-num 200 --save-model-interval 10 --grid-type 1 --beta 8 --noiseE 0.05```


```python train_expert/infinite_imitation.py --env-name DiscreteGaussianGridworld-v0  --expert-trajs trajs30.pkl --num-threads 1 --max-iter-num 200 --save-model-interval 10 --grid-type 1 --beta 8 --noiseE 0.1```

To reproduce the proximal point results, use:

```python train_learner/ppil.py --env-name DiscreteGaussianGridworld-v0  --expert-trajs trajs16.pkl --num-threads 1 --max-iter-num 200 --save-model-interval 10 --grid-type 1 --eta 1 --noiseE 0.0```

```python train_learner/ppil.py --env-name DiscreteGaussianGridworld-v0  --expert-trajs trajs6.pkl --num-threads 1 --max-iter-num 200 --save-model-interval 10 --grid-type 1 --eta 1 --noiseE 0.05```

```python train_learner/ppil.py --env-name DiscreteGaussianGridworld-v0  --expert-trajs trajs30.pkl --num-threads 1 --max-iter-num 200 --save-model-interval 10 --grid-type 1 --eta 1 --noiseE 0.1```

For GAIL

```python train_learner/gail.py --env-name DiscreteGaussianGridworld-v0  --expert-trajs trajs16.pkl --num-threads 1 --max-iter-num 200 --save-model-interval 10 --grid-type 1 --eta 1 --noiseE 0.0```

```python train_learner/gail.py --env-name DiscreteGaussianGridworld-v0  --expert-trajs trajs6.pkl --num-threads 1 --max-iter-num 200 --save-model-interval 10 --grid-type 1 --eta 1 --noiseE 0.05```

```python train_learner/gail.py --env-name DiscreteGaussianGridworld-v0  --expert-trajs trajs30.pkl --num-threads 1 --max-iter-num 200 --save-model-interval 10 --grid-type 1 --eta 1 --noiseE 0.1```

For AIRL (the GAIL script with `--reward-type airl`)

```python train_learner/gail.py --env-name DiscreteGaussianGridworld-v0  --expert-trajs trajs16.pkl --num-threads 1 --max-iter-num 200 --save-model-interval 10 --grid-type 1 --eta 1 --noiseE 0.0  --reward-type airl```

```python train_learner/gail.py --env-name DiscreteGaussianGridworld-v0  --expert-trajs trajs6.pkl --num-threads 1 --max-iter-num 200 --save-model-interval 10 --grid-type 1 --eta 1 --noiseE 0.05 --reward-type airl```

```python train_learner/gail.py --env-name DiscreteGaussianGridworld-v0  --expert-trajs trajs30.pkl --num-threads 1 --max-iter-num 200 --save-model-interval 10 --grid-type 1 --eta 1 --noiseE 0.1 --reward-type airl```

For REIRL

```python train_learner/reirl.py --env-name DiscreteGaussianGridworld-v0  --expert-trajs trajs16.pkl --num-threads 1 --max-iter-num 200 --save-model-interval 10 --grid-type 1 --eta 1 --noiseE 0.0```

```python train_learner/reirl.py --env-name DiscreteGaussianGridworld-v0  --expert-trajs trajs6.pkl --num-threads 1 --max-iter-num 200 --save-model-interval 10 --grid-type 1 --eta 1 --noiseE 0.05```

```python train_learner/reirl.py --env-name DiscreteGaussianGridworld-v0  --expert-trajs trajs30.pkl --num-threads 1 --max-iter-num 200 --save-model-interval 10 --grid-type 1 --eta 1 --noiseE 0.1```

For IQLearn

```python train_learner/iqlearn.py --env-name DiscreteGaussianGridworld-v0  --expert-trajs trajs16.pkl --num-threads 1 --max-iter-num 200 --save-model-interval 10 --grid-type 1 --eta 1 --noiseE 0.0```

```python train_learner/iqlearn.py --env-name DiscreteGaussianGridworld-v0  --expert-trajs trajs6.pkl --num-threads 1 --max-iter-num 200 --save-model-interval 10 --grid-type 1 --eta 1 --noiseE 0.05```

```python train_learner/iqlearn.py --env-name DiscreteGaussianGridworld-v0  --expert-trajs trajs30.pkl --num-threads 1 --max-iter-num 200 --save-model-interval 10 --grid-type 1 --eta 1 --noiseE 0.1```
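
The Figure 2 commands above follow a regular pattern: each method is run once per (expert trajectory file, `--noiseE`) pair. As a convenience when SLURM is not available, the sketch below regenerates those command lines in Python. It is not part of the repository: the helper name `build_command` and the dictionary keys are hypothetical, and only scripts, flags, and values that appear in this README are used.

```python
import shlex

# Flags shared by every command in this README (values copied from above).
COMMON = [
    "--num-threads", "1",
    "--max-iter-num", "200",
    "--save-model-interval", "10",
    "--grid-type", "1",
]

# (expert trajectory file, --noiseE value) pairs listed for Figure 2.
FIGURE2_SETTINGS = [
    ("trajs16.pkl", "0.0"),
    ("trajs6.pkl", "0.05"),
    ("trajs30.pkl", "0.1"),
]

# Hypothetical keys -> (script, method-specific flags, trailing flags).
METHODS = {
    "infinite_imitation": ("train_expert/infinite_imitation.py", ["--beta", "8"], []),
    "ppil": ("train_learner/ppil.py", ["--eta", "1"], []),
    "gail": ("train_learner/gail.py", ["--eta", "1"], []),
    "airl": ("train_learner/gail.py", ["--eta", "1"], ["--reward-type", "airl"]),
    "reirl": ("train_learner/reirl.py", ["--eta", "1"], []),
    "iqlearn": ("train_learner/iqlearn.py", ["--eta", "1"], []),
}


def build_command(method, trajs, noise):
    """Return the argument list for one run, in the same flag order as the README."""
    script, method_flags, trailing = METHODS[method]
    return [
        "python", script,
        "--env-name", "DiscreteGaussianGridworld-v0",
        "--expert-trajs", trajs,
        *COMMON,
        *method_flags,
        "--noiseE", noise,
        *trailing,
    ]


if __name__ == "__main__":
    # Print the commands instead of running them; pipe the output to a shell
    # or pass each argument list to subprocess.run to execute.
    for method in METHODS:
        for trajs, noise in FIGURE2_SETTINGS:
            print(shlex.join(build_command(method, trajs, noise)))
```

The script only prints the 18 command lines, so it can be checked against the list above before anything is launched.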

All curves are averaged across 5 seeds.
The scripts `launch_experiment.py` and `launch_noisy.py` run all experiments in figures 1 and 2, but they require SLURM to be installed.
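
For readers who run the seeds by hand, the averaging across 5 seeds mentioned above can be sketched as follows. This is a minimal illustration, not the repository's plotting code: it assumes each seed yields one list of per-iteration values of the same length, and the function name `average_curves` and the toy numbers are hypothetical. How the training scripts log their results is not described here, so loading the curves is left out.

```python
from statistics import fmean, pstdev


def average_curves(curves):
    """Average per-iteration values across seeds.

    curves: list with one sequence of floats per seed (assumed equal length).
    Returns (mean, spread), each a list with one entry per iteration, where
    spread is the population standard deviation across seeds.
    """
    if len({len(c) for c in curves}) != 1:
        raise ValueError("expected at least one curve, all of the same length")
    # zip(*curves) groups the values of all seeds at the same iteration.
    mean = [fmean(step) for step in zip(*curves)]
    spread = [pstdev(step) for step in zip(*curves)]
    return mean, spread


if __name__ == "__main__":
    # Toy data standing in for two seeds with three logged iterations each.
    toy = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
    mean, spread = average_curves(toy)
    print(mean, spread)
```

The spread can be drawn as a shaded band around the mean curve when plotting.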

