# Few-shot Inverse RL Project

We removed the `environments` folder to keep the supplementary file within the upload size limit. The full code can be downloaded from this [link](https://drive.google.com/drive/folders/1JbN2GX0005qrPSvRD3IEASOa9o5S3SMs?usp=sharing).

The code is based on Youngwoon's robot-learning repository: https://github.com/youngwoon/robot-learning/tree/master
 
## Demos
Our data is available at this [link](https://drive.google.com/drive/folders/1JbN2GX0005qrPSvRD3IEASOa9o5S3SMs?usp=sharing). Create a folder named `demos` under the project root, download the `demos` file from the link, and place the data in that folder.

## Run IL algorithms with Demonstrations

### BC
Please run BC to pretrain the policy before running any of the algorithms below; they load the resulting checkpoint through `--init_ckpt_path`.
```bash
# train BC on target demos
$ python -m run --run_prefix test --algo bc --env maze2d-large-blue-v3 --max_global_step 300 --num_target_demos 2
```
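
The commands in the following sections load the BC checkpoint from `log/maze2d-large-blue-v3.bc.test.123/ckpt_000000300.pt`. As a rough sketch, the helper below rebuilds that path for other environments or run prefixes. The naming pattern (`<env>.<algo>.<run_prefix>.<seed>`), the seed `123`, and the 9-digit zero-padded step are inferred only from the example paths in this README, not from the training code, and `bc_checkpoint_path` is a hypothetical helper, so please verify against your own `log/` directory.

```python
from pathlib import Path


def bc_checkpoint_path(env, run_prefix, seed=123, step=300, algo="bc", log_root="log"):
    """Guess the checkpoint path written by a BC run.

    Assumption: run directories are named <env>.<algo>.<run_prefix>.<seed>
    and checkpoints are named ckpt_<step padded to 9 digits>.pt, as
    suggested by the example paths in this README.
    """
    run_name = f"{env}.{algo}.{run_prefix}.{seed}"
    return Path(log_root) / run_name / f"ckpt_{step:09d}.pt"


path = bc_checkpoint_path("maze2d-large-blue-v3", "test")
print(path.as_posix())
```

The printed path can be passed to `--init_ckpt_path` if it matches the file that the BC run actually produced.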
### GAIL
```bash
$ python -m run --run_prefix test --algo gail --env maze2d-large-blue-v3 --num_target_demos 2 --init_ckpt_path log/maze2d-large-blue-v3.bc.test.123/ckpt_000000300.pt --pretrain_discriminator 2
```

### SQIL
```bash
$ python -m run --run_prefix test --algo sqil --env maze2d-large-blue-v3 --init_ckpt_path log/maze2d-large-blue-v3.bc.test.123/ckpt_000000300.pt --num_target_demos 2 --other_task_data_proportion 0.2 --max_global_step 1000000
```

### DVD
```bash
$ python -m run --run_prefix test --algo gail-v2 --env maze2d-large-blue-v3 --init_ckpt_path log/maze2d-large-blue-v3.bc.test.123/ckpt_000000300.pt --num_target_demos 2 --pretrain_discriminator 2 --pre_dvd_step 200 --is_frozen True
```

### MPIRL (Ours)
```bash
$ python -m run --run_prefix test --algo reachable_gail --env maze2d-large-blue-v3 --init_ckpt_path log/maze2d-large-blue-v3.bc.test.123/ckpt_000000300.pt --num_target_demos 2 --pretrain_n_epochs 5 --dense_reward_scale 0.001 --is_DVD_relabel True
```
## Environments
These are the environments used in the paper. Any of them can be passed to `--env`:
* `stack-blue-magenta-v0`
* `plate-slide-back-v2`
* `lever-pull-v2`
* `drawer-open-v2`
* `door-lock-v2`
* `door-open-v2`
* `button-press-wall-v2`
* `door-unlock-v2`
* `maze2d-large-blue-v3`
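
To pretrain BC on every environment listed above, the commands can be generated rather than typed by hand. The sketch below only prints the command lines; it reuses the flags from the BC example in this README. Whether `--max_global_step 300` and `--num_target_demos 2` suit every environment is an assumption copied from the maze example, and `bc_command` is a hypothetical helper, not part of the repository.

```python
import shlex

# Environment names copied from the list in this README.
ENVS = [
    "stack-blue-magenta-v0",
    "plate-slide-back-v2",
    "lever-pull-v2",
    "drawer-open-v2",
    "door-lock-v2",
    "door-open-v2",
    "button-press-wall-v2",
    "door-unlock-v2",
    "maze2d-large-blue-v3",
]


def bc_command(env, run_prefix="test", max_global_step=300, num_target_demos=2):
    """Build the BC pretraining command for one environment as an argument list."""
    return [
        "python", "-m", "run",
        "--run_prefix", run_prefix,
        "--algo", "bc",
        "--env", env,
        "--max_global_step", str(max_global_step),
        "--num_target_demos", str(num_target_demos),
    ]


for env in ENVS:
    # Quote each argument so the line can be pasted into a shell as-is.
    print(" ".join(shlex.quote(arg) for arg in bc_command(env)))
```

Each argument list can also be passed directly to `subprocess.run` to launch the runs one after another.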

## Directories
* `run.py`: simply launches `main.py`
* `main.py`: sets up experiment and runs training using `trainer.py`
* `trainer.py`: contains training and evaluation code
* `algorithms/`: implementation of all RL and IL algorithms
* `config/`: hyper-parameters in `config/__init__.py`
* `environments/`: registers environments (OpenAI Gym and DeepMind Control Suite)
* `networks/`: implementation of networks, such as policy and value function
* `utils/`: contains helper functions


## Prerequisites
* Ubuntu 18.04 or above
* Python 3.6
* MuJoCo 2.0


## Installation

1. Install MuJoCo 2.0 and add the following environment variables to `~/.bashrc` or `~/.zshrc`:
```bash
# download mujoco 2.0
$ wget https://www.roboti.us/download/mujoco200_linux.zip -O mujoco.zip
$ unzip mujoco.zip -d ~/.mujoco
$ cp -r ~/.mujoco/mujoco200_linux ~/.mujoco/mujoco200

# copy mujoco license key `mjkey.txt` to `~/.mujoco`

# add mujoco to LD_LIBRARY_PATH
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/.mujoco/mujoco200/bin

# for GPU rendering (replace 418 with your nvidia driver version or you can make a dummy directory /usr/lib/nvidia-000)
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/nvidia-418

# only for a headless server
$ export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libGLEW.so:/usr/lib/nvidia-418/libGL.so
```

2. Install the system packages and Python dependencies:
```bash
$ sudo apt-get install cmake libopenmpi-dev libgl1-mesa-dev libgl1-mesa-glx libosmesa6-dev patchelf libglew-dev

# software rendering
$ sudo apt-get install libgl1-mesa-glx libosmesa6 patchelf

# window rendering
$ sudo apt-get install libglfw3 libglew2.0

$ pip install -r requirements.txt
```
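
After completing both steps, a quick pre-flight check can catch a missing license key or an unset `LD_LIBRARY_PATH` before a training run fails. The following is a minimal sketch that only inspects the locations named in the installation steps above; `check_mujoco_setup` is a hypothetical helper and does not verify that MuJoCo itself loads.

```python
import os
from pathlib import Path


def check_mujoco_setup(home, environ):
    """Return a list of problems found with the MuJoCo 2.0 layout described above."""
    mujoco_root = Path(home) / ".mujoco"
    bin_dir = mujoco_root / "mujoco200" / "bin"
    problems = []

    if not bin_dir.is_dir():
        problems.append(f"missing directory: {bin_dir}")
    if not (mujoco_root / "mjkey.txt").is_file():
        problems.append(f"missing license key: {mujoco_root / 'mjkey.txt'}")

    # LD_LIBRARY_PATH is a colon-separated list; ignore trailing slashes.
    ld_paths = [p.rstrip("/") for p in environ.get("LD_LIBRARY_PATH", "").split(":")]
    if str(bin_dir) not in ld_paths:
        problems.append(f"{bin_dir} is not in LD_LIBRARY_PATH")

    return problems


problems = check_mujoco_setup(Path.home(), os.environ)
for problem in problems:
    print("WARNING:", problem)
if not problems:
    print("MuJoCo paths look consistent with the installation steps.")
```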

