## GM Labeled Dataset

To reproduce the Gold Model (GM) labeled dataset from the paper, run three steps in order. First, use `create_pairwise_inference.py` with an initial SFT policy to generate a pair of completions for each prompt. Next, use `create_gold_model_dataset.py` to score the completions with the Gold Model. Finally, use `create_gm_labeled_dataset.ipynb` to create the dataset.
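
For orientation, the labeling idea behind the last step can be sketched as follows. This is a minimal illustration, not the notebook's actual code: it assumes each scored record holds a prompt, two completions, and one Gold Model score per completion, and that the higher-scored completion is marked as preferred. The field names (`completion_0`, `score_0`, and so on) and the `label_pairs` helper are hypothetical.

```python
from typing import Dict, List


def label_pairs(scored: List[Dict]) -> List[Dict]:
    """Turn gold-model-scored completion pairs into chosen/rejected records.

    Assumption: the completion with the higher gold score is preferred.
    Ties are resolved in favor of the first completion.
    """
    labeled = []
    for row in scored:
        # Hypothetical field names; the real scripts may use different keys.
        first_wins = row["score_0"] >= row["score_1"]
        labeled.append(
            {
                "prompt": row["prompt"],
                "chosen": row["completion_0"] if first_wins else row["completion_1"],
                "rejected": row["completion_1"] if first_wins else row["completion_0"],
            }
        )
    return labeled


# Toy records standing in for the output of the scoring step.
example = [
    {"prompt": "p1", "completion_0": "a", "completion_1": "b", "score_0": 0.2, "score_1": 0.9},
    {"prompt": "p2", "completion_0": "c", "completion_1": "d", "score_0": 1.5, "score_1": -0.3},
]
labeled = label_pairs(example)
```

Check `create_gm_labeled_dataset.ipynb` for the actual record format and tie-handling before relying on this sketch.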