# Supplementary Materials for Step-controlled DPO

- code: https://anonymous.4open.science/r/Step-controlled-DPO_supplementary-code-FF97

- model weights of InternLM2-SFT-SCDPO: https://huggingface.co/StepControlled/InternLM-20B-SCDPO

- SCDPO data for Mistral-7B-Ours: https://huggingface.co/datasets/StepControlled/SCDPO-Data-Mistral-Ours

- DPO data for Mistral-7B-Ours: https://huggingface.co/datasets/StepControlled/DPO-Data-Mistral-Ours

- 81K tool-integrated SFT data for Mistral-7B-Ours: https://huggingface.co/datasets/StepControlled/MATH-GSM8K-Tool-81K

All of the repositories above are anonymized and disclose no information about the authors.

The code is also stored under `code/Step-Controlled_DPO`, and the SCDPO data for Mistral-7B-Ours is also stored under `data/Data-Mistral-7B-Ours-SCDPO`.
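
For convenience, the datasets linked above can probably be fetched with the Hugging Face `datasets` library. The snippet below is a minimal sketch, not part of the released code. It assumes that `datasets` is installed and that each dataset repository is laid out so `load_dataset` can detect its files automatically. The short keys (`scdpo`, `dpo`, `sft`) and the helper name `load_data` are hypothetical names chosen for illustration; only the repository IDs come from the links above.

```python
# Repository IDs copied from the dataset links listed above.
# The short keys are hypothetical labels, not names used by the authors.
DATASET_REPOS = {
    "scdpo": "StepControlled/SCDPO-Data-Mistral-Ours",
    "dpo": "StepControlled/DPO-Data-Mistral-Ours",
    "sft": "StepControlled/MATH-GSM8K-Tool-81K",
}


def load_data(name):
    """Download one of the datasets listed above from the Hugging Face Hub.

    Assumption: the repository layout is one that `load_dataset`
    can auto-detect (e.g. JSON or Parquet files).
    """
    # Imported lazily so the mapping can be used without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(DATASET_REPOS[name])


# Example (requires network access):
#   scdpo = load_data("scdpo")
#   print(scdpo)  # lists whatever splits and columns the repository provides
```

If automatic detection fails, the files can still be downloaded manually from the dataset pages linked above.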