
# Requirements
Requirements can be installed with `pip install -r requirements.txt`.

# Dataset creation
Preference data creation is handled by `trained_calibration/rl/dataset/dpo_dataset.py`, which takes a configuration file. Configurations are stored in `trained_calibration/configs` and are in YAML format.
After creation, the preference data should be stored in a `data` directory.
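
As a convenience, the two locations mentioned above can be checked before running dataset creation. The following is a minimal sketch, not part of the repository: `list_configs` and `ensure_data_dir` are hypothetical helper names, and the `.yaml`/`.yml` file extensions and the default paths (relative to the repository root) are assumptions.

```python
from pathlib import Path


def list_configs(config_dir="trained_calibration/configs"):
    """Return the YAML files found directly in config_dir, sorted by path."""
    root = Path(config_dir)
    # Accept both common YAML extensions; which one the repo uses is an assumption.
    return sorted(p for p in root.iterdir() if p.suffix in {".yaml", ".yml"})


def ensure_data_dir(data_dir="data"):
    """Create the data directory if it is missing and return its path."""
    path = Path(data_dir)
    # exist_ok=True makes this safe to call repeatedly.
    path.mkdir(parents=True, exist_ok=True)
    return path
```

Calling `list_configs()` from the repository root should show which configuration files are available to pass to `dpo_dataset.py`, and `ensure_data_dir()` makes sure the `data` directory exists before the preference data is written.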

# Training
Models can be trained using `trained_calibration/rl/train_dpo.py`. The available options can be listed with `python trained_calibration/rl/train_dpo.py -h`.

# Decoding 
After training, models can be decoded and scored using the `trained_calibration/rl/evaluate_dpo.py` script.

# Evaluation 
The final evaluation script is `trained_calibration/eval/dpo_eval.py`.
