# COMPASS: Training-Free Guidance for Skill Discovery with Human Feedback

This is the official implementation of COMPASS.

## Requirements

Install the basic packages from [METRA](https://github.com/seohongpark/METRA):

```
pip install -r requirements.txt --no-deps
```

Install the customized packages (run the commands in order; warnings about incompatible versions can be ignored):

```
pip install -e envs/safety-gym
pip install -e garaged
pip install --upgrade joblib
pip install patchelf
```


## Run experiments

All training scripts are stored in `scripts/`. Some examples follow.

Train COMPASS on the Ant North task:
```bash
bash scripts/pretrain/metra_pref_query/metra_pref_query_main.sh ant n 1 0 0 2 20 2000 500 100 10 2000 5 0
# bash scripts/pretrain/metra_pref_query/metra_pref_query_main.sh env pref_task pref_coef \
#      seed device dim_option query_segmentlen query_warmup query_freq query_limit \
#      query_batchsize query_method weight_smooth_decay_speed discrete
```
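Because the script takes fourteen positional arguments, it can help to name them before launching a run. The sketch below is a dry run that only prints the command. It assumes that each value in the example maps to the name in the same position of the usage comment, which has not been checked against the script itself. The variable `env_name` stands in for `env`.

```bash
#!/usr/bin/env bash
# Dry run: name each positional argument of the Ant North example, then print
# the resulting command instead of executing it.
# ASSUMPTION: the i-th value maps to the i-th name in the usage comment.
env_name=ant
pref_task=n
pref_coef=1
seed=0
device=0
dim_option=2
query_segmentlen=20
query_warmup=2000
query_freq=500
query_limit=100
query_batchsize=10
query_method=2000
weight_smooth_decay_speed=5
discrete=0

args=(
  "$env_name" "$pref_task" "$pref_coef" "$seed" "$device" "$dim_option"
  "$query_segmentlen" "$query_warmup" "$query_freq" "$query_limit"
  "$query_batchsize" "$query_method" "$weight_smooth_decay_speed" "$discrete"
)

script=scripts/pretrain/metra_pref_query/metra_pref_query_main.sh
cmd="bash $script ${args[*]}"
echo "$cmd"
```

To launch the run for real, replace the final `echo` with `bash "$script" "${args[@]}"`.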

Train Oracle (GSD with an oracle guidance signal) on the HalfCheetah Not-Flip task:
```bash
bash scripts/pretrain/metra_pref/metra_pref_discrete.sh half_cheetah not_flip 1 0 0
# bash scripts/pretrain/metra_pref/metra_pref_discrete.sh env pref_task pref_coef seed device
```
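To repeat the Oracle run over several seeds, a small loop is one option. The sketch below is a dry run that only prints the commands. The seed values `0 1 2` are illustrative and do not come from this repository, and the fourth positional argument is assumed to be the seed, as the usage comment suggests.

```bash
#!/usr/bin/env bash
# Dry run: build one Oracle command per seed and print it.
# ASSUMPTION: positional order is env pref_task pref_coef seed device.
script=scripts/pretrain/metra_pref/metra_pref_discrete.sh
device=0
cmds=()
for seed in 0 1 2; do  # illustrative seeds, not taken from the repository
  cmds+=("bash $script half_cheetah not_flip 1 $seed $device")
done
printf '%s\n' "${cmds[@]}"
```

Each printed line can be run as-is; with `seed=0` it matches the example command above.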

## Acknowledgement

This code repo is based on the [CSF repo](https://github.com/Princeton-RL/contrastive-successor-features) and benefits from the following repos. Thanks to their authors for the wonderful work.
* Safety Gym: https://github.com/openai/safety-gym