# Code base for Submission

# Appendix : 
The appendix can be found in the accompanying PDF.


# Installation Instructions : 
Make sure stablebaselines3 and rlkit are available to the code (installed in the environment or placed on the Python path).

```cmd
pip install torch matplotlib 'gym[mujoco,atari]' hydra-core==1.0.6 tensorboard pybullet tqdm termcolor scikit-image scikit-learn
pip install git+https://github.com/rlworkgroup/metaworld.git@master#egg=metaworld
pip install protobuf==3.20
pip install 'gym[accept-rom-license]'
```

Install DMControl and DMC2Gym as in B-Pref:
```
conda env create -f conda_env.yml
pip install -e .[docs,tests,extra]
cd custom_dmcontrol
pip install -e .
# Assumes custom_dmc2gym sits next to custom_dmcontrol, as in B-Pref.
cd ../custom_dmc2gym
pip install -e .
pip install git+https://github.com/rlworkgroup/metaworld.git@master#egg=metaworld
pip install pybullet
```
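After installation, a quick import check can catch missing packages early. The sketch below is not part of the repository. The module names are assumed import names for the packages mentioned above (for example `stable_baselines3` for stablebaselines3 and `dmc2gym` for DMC2Gym) and may need adjusting.

```python
import importlib.util

# Assumed import names for the dependencies listed above; adjust as needed.
REQUIRED_MODULES = [
    "torch",
    "gym",
    "hydra",
    "metaworld",
    "pybullet",
    "stable_baselines3",
    "rlkit",
    "dm_control",
    "dmc2gym",
]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

The check only looks the modules up without importing them, so it does not verify that MuJoCo or other native dependencies load correctly.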

## Run Experiments 
Important flags:
- `adloss`
- `tloss`
- `l2embed`
- `rdynamics`
- `surf_loss`

Each of these flags can be set to `none` or `base`. If all of them are `none`, `train_PEBBLE.py` trains the baseline PEBBLE.
Refer to `configs/train_PEBBLE.yaml` for more config parameters.
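As an illustration of how the flags combine, the following sketch builds a command line for `train_PEBBLE.py`. It assumes, without confirmation from this README, that the flags are top-level Hydra config keys that accept `key=value` overrides. The helper `build_command` is hypothetical and not part of the code base.

```python
import shlex

# Flag names taken from the list above; each is either "none" or "base".
LOSS_FLAGS = ("adloss", "tloss", "l2embed", "rdynamics", "surf_loss")


def build_command(enabled=(), script="train_PEBBLE.py"):
    """Return an argument list with enabled flags set to 'base', others to 'none'.

    Assumption: the flags are top-level Hydra keys overridable as key=value.
    """
    unknown = set(enabled) - set(LOSS_FLAGS)
    if unknown:
        raise ValueError(f"unknown flags: {sorted(unknown)}")
    overrides = [
        f"{flag}={'base' if flag in enabled else 'none'}" for flag in LOSS_FLAGS
    ]
    return ["python", script] + overrides


# Baseline PEBBLE: every flag is "none".
print(shlex.join(build_command()))
# Example with two flags switched on.
print(shlex.join(build_command(enabled=["adloss", "surf_loss"])))
```

If the flags live under a nested config group instead, the override keys would need the corresponding prefix; `configs/train_PEBBLE.yaml` is the place to check.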

## Run User Webpage
Run `test_train_env_webpage.py` to try the web page used for the human study, populated with random trajectory samples.

## Irrational Teacher
`Mistake teacher`: (teacher_beta=-1, teacher_gamma=1, teacher_eps_mistake=0.1, teacher_eps_skip=0, teacher_eps_equal=0)
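To make the effect of these settings concrete, here is a minimal sketch of a mistake teacher. It assumes the parameters follow the simulated teacher model from B-Pref: `teacher_beta=-1` selects deterministic (rational) preferences, `teacher_gamma=1` applies no discounting, and zero values for `teacher_eps_skip` and `teacher_eps_equal` disable skipped and equal labels. The function `mistake_teacher_labels` is hypothetical and is not the repository's implementation.

```python
import random


def mistake_teacher_labels(returns_1, returns_2, eps_mistake=0.1, seed=None):
    """Label each segment pair, then flip the label with probability eps_mistake.

    A label of 1 means the second segment has the higher return, 0 otherwise.
    Assumption: returns are plain undiscounted sums (teacher_gamma=1).
    """
    rng = random.Random(seed)
    labels = []
    for r1, r2 in zip(returns_1, returns_2):
        label = 1 if r2 > r1 else 0  # rational preference
        if rng.random() < eps_mistake:  # teacher makes a mistake
            label = 1 - label
        labels.append(label)
    return labels


# With eps_mistake=0.1, roughly one label in ten is flipped on average.
labels = mistake_teacher_labels([1.0, 5.0, 2.0], [3.0, 4.0, 2.5], seed=0)
print(labels)
```

Setting `eps_mistake=0` recovers a perfectly rational teacher under these assumptions.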
