## Installing dependencies

conda create --name GDPO --file spec-list.txt

pip install -r requirements.txt

If the installation still does not work, please refer to the DiGress repo for any additional required dependencies.
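The two installation commands can be wrapped in a single function, sketched below. This is not part of the repository: `setup_gdpo_env` is a hypothetical helper name, and the sketch assumes that the pip packages are meant to be installed inside the GDPO environment and that both files sit in the current directory.

```shell
#!/usr/bin/env bash
# Sketch only: setup_gdpo_env is a hypothetical helper, not shipped with the repo.
setup_gdpo_env() {
    # Fail early if either file named in this README is missing.
    local f
    for f in spec-list.txt requirements.txt; do
        [ -f "$f" ] || { echo "missing $f" >&2; return 1; }
    done

    # Create the environment from the explicit spec list.
    conda create --yes --name GDPO --file spec-list.txt || return 1

    # Make `conda activate` usable in a non-interactive shell.
    eval "$(conda shell.bash hook)"
    conda activate GDPO || return 1

    # Assumption: the pip requirements belong in the GDPO environment.
    pip install -r requirements.txt
}
```

Call `setup_gdpo_env` from the repository root. It stops with a non-zero status as soon as a file is missing or a conda step fails.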

## Running the toy experiments

To run GDPO, set train_method to "olppox0" in run_ppo_toy.sh.

To run DDPO, set train_method to "olppo" in run_ppo_toy.sh.

Then run:

bash run_ppo_toy.sh
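The train_method switch described above can also be scripted instead of edited by hand. The sketch below makes an assumption about the file format: that the script sets the value on a line of the form `train_method="..."`. Check run_ppo_toy.sh first, because the real script may pass it differently (for example as a command-line flag). `set_train_method` is a hypothetical helper, and the demo runs on a throwaway file, not on the real script.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: rewrite the train_method value in a script.
# Assumes a line of the form: train_method="..."
set_train_method() {
    local file="$1" method="$2"
    # -i.bak keeps a backup copy and works with both GNU and BSD sed.
    sed -i.bak -E "s/^([[:space:]]*train_method=).*/\1\"${method}\"/" "$file"
}

# Demo on a temporary stand-in for run_ppo_toy.sh.
demo="$(mktemp)"
printf 'train_method="olppo"\n# rest of the script\n' > "$demo"

set_train_method "$demo" "olppox0"   # switch from DDPO to GDPO
grep train_method "$demo"            # prints: train_method="olppox0"
```

If the assumed line format matches, `set_train_method run_ppo_toy.sh olppox0` followed by `bash run_ppo_toy.sh` would run GDPO, and passing `olppo` instead would run DDPO.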

For other experiments, please check ./configs/experiments to see which datasets and pretrained models need to be prepared.