Required Libraries: NumPy 1.26.4, Matplotlib 3.9.1
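
To confirm that the local environment matches the versions listed above, a small check along the following lines may help. This is only a sketch and not part of the repository: the helper name check_versions and the REQUIRED mapping are hypothetical, and it assumes the packages are installed under their usual distribution names (numpy, matplotlib).

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the "Required Libraries" line; the dict name is hypothetical.
REQUIRED = {"numpy": "1.26.4", "matplotlib": "3.9.1"}


def check_versions(required):
    """Return {package: (wanted, installed or None)} for every mismatch."""
    mismatches = {}
    for name, wanted in required.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    # Print one line per package that is missing or at a different version.
    for name, (wanted, installed) in check_versions(REQUIRED).items():
        print(f"{name}: expected {wanted}, found {installed or 'not installed'}")
```

An empty result means both packages are present at the listed versions. Other versions may well work, but they are not the ones stated here.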

file							Folder containing raw experiment results
AlgorithmNumpy.py				Algorithm classes for ZSPO and baselines
Env.py						Environment classes
Panel.py						Preference feedback classes
Main-DPO.py					Run DPO algorithm and collect results
Main-ODPO.py					Run Online DPO algorithm and collect results
Main-RM.py					Train a reward model
Main-PPO.py					Run PPO algorithm and collect results
Main-ZPG.py					Run ZPG algorithm and collect results
Main-ZSPO.py					Run ZSPO algorithm and collect results
Plot-Figure-Reward.ipynb			Plot reward figures



