# Boosting Offline Multi-Objective Reinforcement Learning via Preference Conditioned Diffusion Models

This repository contains the reference implementation for the paper **Boosting Offline Multi-Objective Reinforcement Learning via Preference Conditioned Diffusion Models**.
## Install
Create conda environment:
  ```
  cd diffmorl
  conda env create -f environment.yml
  conda activate diffmorl_env
  ```
Install the diffuser for DiffMORL:
  ```
  cd diffuser
  python -m pip install -e .
  ```
## Data Download or Generation
You can download datasets from the [PEDA](https://github.com/baitingzbt/PEDA) repo by following their instructions. Due to storage limits, we are unable to open-source all data variants. We recommend downloading the pretrained behavioral policies from the [PEDA](https://github.com/baitingzbt/PEDA) repo and generating all data by following the examples in `data_generation/collect_all.sh` and `data_generation/collect_custom.sh`. Note that `custom` includes the `incomplete dataset` used in our paper, which is tagged as `custom-large`. Other types of incomplete datasets can be collected by modifying the settings in `data_generation/custom_pref.py`.

## Training and Evaluation
Add the path to the `diffmorl` directory to your `PYTHONPATH` environment variable by running:
```
export PYTHONPATH=<path-to-diffmorl>
```
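
To confirm that the variable is set as intended, you can check it from Python. The snippet below is a minimal sketch and not part of this repository; `on_pythonpath` is a hypothetical helper name, and it only compares normalized paths.

```python
import os

def on_pythonpath(repo_root, env=None):
    """Return True if repo_root appears among the PYTHONPATH entries."""
    # `env` defaults to the real environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    entries = env.get("PYTHONPATH", "").split(os.pathsep)
    target = os.path.abspath(repo_root)
    # Skip empty entries, which appear when PYTHONPATH is unset or has stray separators.
    return any(e and os.path.abspath(e) == target for e in entries)

if __name__ == "__main__":
    # Run from the diffmorl directory; prints True if it is on PYTHONPATH.
    print(on_pythonpath(os.getcwd()))
```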


An example command for a single experiment:
```
python experiment.py --dir experiment_runs/example --env MO-HalfCheetah-v2 --seed 2 --dataset expert_custom --model_type mod --mod_type bc --num_steps_per_iter 400000 --max_iters 1 --use_p_bar True --K 8 --infer_N 7 --n_diffusion_steps 8 --returns_condition True --mixup True --mixup_num 6 --mixup_step 400000
```
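
To repeat the same configuration over several seeds, the command above can be assembled programmatically. The following is a minimal sketch, not a script shipped with this repository: `BASE_ARGS` and `build_command` are hypothetical names, the flag values are copied from the example above, and giving each seed its own `--dir` suffix is an assumption made here to keep runs separate. The script only prints the commands.

```python
import shlex

# Flags copied from the single-experiment example; names here are hypothetical.
BASE_ARGS = {
    "env": "MO-HalfCheetah-v2",
    "dataset": "expert_custom",
    "model_type": "mod",
    "mod_type": "bc",
    "num_steps_per_iter": 400000,
    "max_iters": 1,
    "use_p_bar": True,
    "K": 8,
    "infer_N": 7,
    "n_diffusion_steps": 8,
    "returns_condition": True,
    "mixup": True,
    "mixup_num": 6,
    "mixup_step": 400000,
}

def build_command(seed, run_dir="experiment_runs/example"):
    """Return the argv list for one run of experiment.py with the given seed."""
    # Assumption: one output directory per seed, e.g. experiment_runs/example_seed0.
    cmd = ["python", "experiment.py", "--dir", f"{run_dir}_seed{seed}", "--seed", str(seed)]
    for flag, value in BASE_ARGS.items():
        cmd += [f"--{flag}", str(value)]
    return cmd

if __name__ == "__main__":
    for seed in (0, 1, 2):
        # Print a copy-pasteable shell command for each seed.
        print(shlex.join(build_command(seed)))
```

To launch the runs instead of printing them, each list returned by `build_command` can be passed to `subprocess.run`.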
Other example commands are included in `scripts/examples.sh`. To reproduce our results (after you have collected all datasets to be used), run
```
sh scripts/diffmorl_main.sh
```
Double-check your CUDA device and data path in the shell scripts.
After training, models are evaluated automatically. The Pareto fronts and all metrics are stored in the directory specified by `--dir`, where the trained models are also saved.
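
For readers who want to post-process evaluation returns themselves, a Pareto front is the subset of points that no other point dominates. The snippet below is an illustrative sketch only, not the evaluation code used in this repository; it assumes every objective is maximized, and `dominates`, `pareto_front`, and the sample `returns` values are hypothetical.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(points):
    """Return the non-dominated points, preserving input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

if __name__ == "__main__":
    # Made-up two-objective returns, for illustration only.
    returns = [(1.0, 5.0), (2.0, 4.0), (1.5, 3.0), (3.0, 1.0), (0.5, 0.5)]
    print(pareto_front(returns))  # [(1.0, 5.0), (2.0, 4.0), (3.0, 1.0)]
```

This pairwise check is quadratic in the number of points, which is usually acceptable for the small sets of evaluated preferences typical in this setting.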
