This folder contains the code to reproduce the experiments in the article **Learning guarantee of reward modeling using deep neural networks**.

The structure of this repository is as follows:
```
<working directory>/
├─README.md
├─plot_two.py
├─CodeR_BTmodel
    ├─ result/
    ├─ ready.py
    ├─ funcs.py
    ├─ sim_new.py
    ├─ run_toy.sh
    ├─ Regret.png
├─CodeR_THmodel
    ├─ result/
    ├─ ready.py
    ├─ funcs.py
    ├─ sim_new.py
    ├─ run_toy.sh
    ├─ Regret.png
```

- result/: the result files for all tasks. They are read by ready.py to generate figures.
- run_toy.sh: script to train the models and save the output results.
- ready.py: code to generate the output figures.
- sim_new.py: code for training generative models on various datasets.
- funcs.py: functions used to generate the datasets and networks.
- Regret.png: figure visualizing the results.
- plot_two.py: once the results from both the BT model and the Thurstonian model are ready, this file visualizes the results from both models.
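For orientation, the two model names can be read as the two standard pairwise-preference models: Bradley-Terry (BT), where the preference probability is a logistic function of the reward gap, and Thurstonian, where it is a standard normal CDF of the gap. The sketch below only illustrates these textbook definitions. The function names are hypothetical and are not taken from funcs.py or sim_new.py, and the Thurstonian scaling (unit variance on the gap) is an assumption that may differ from the article's convention.

```python
import math


def bt_preference_prob(reward_a, reward_b):
    """P(a preferred over b) under Bradley-Terry: logistic of the reward gap."""
    gap = reward_a - reward_b
    # Two algebraically equal forms, chosen to avoid overflow in exp().
    if gap >= 0:
        return 1.0 / (1.0 + math.exp(-gap))
    return math.exp(gap) / (1.0 + math.exp(gap))


def thurstonian_preference_prob(reward_a, reward_b):
    """P(a preferred over b) under a Thurstonian model: normal CDF of the gap.

    Assumes unit variance on the gap; other scalings are also common.
    """
    gap = reward_a - reward_b
    return 0.5 * (1.0 + math.erf(gap / math.sqrt(2.0)))
```

Both functions return 0.5 when the two rewards are equal and increase with the reward gap; they differ only in the link function.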

Workflow

Preparations:
- Install the PyTorch framework, following the installation guide at https://pytorch.org/get-started/locally/.
- Install the necessary Python packages listed in the Software Environment section.
- Check the local working directory and the structure of this repository.
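The last two checks can be automated. The following is a minimal sketch, not part of the repository: the helper names `missing_paths` and `torch_available` are hypothetical, and the expected file list is copied from the tree above. The result/ directories and Regret.png are left out on the assumption that they are outputs of the runs.

```python
import importlib.util
from pathlib import Path

# Expected layout, copied from the repository tree in this README.
MODEL_DIRS = ("CodeR_BTmodel", "CodeR_THmodel")
MODEL_FILES = ("ready.py", "funcs.py", "sim_new.py", "run_toy.sh")
TOP_LEVEL_FILES = ("README.md", "plot_two.py")


def missing_paths(root="."):
    """Return the expected files (relative to root) that do not exist."""
    root = Path(root)
    expected = [Path(name) for name in TOP_LEVEL_FILES]
    for model_dir in MODEL_DIRS:
        expected.extend(Path(model_dir) / name for name in MODEL_FILES)
    return [p.as_posix() for p in expected if not (root / p).exists()]


def torch_available():
    """Return True if the torch package can be found, without importing it."""
    return importlib.util.find_spec("torch") is not None


if __name__ == "__main__":
    print("PyTorch found:", torch_available())
    print("Missing files:", missing_paths("."))
```

Run from the working directory, an empty "Missing files" list indicates that the layout matches the tree above.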

Run the code for the BT model from the working directory:
```
    cd ./CodeR_BTmodel
    bash run_toy.sh
```

Run the code for the Thurstonian model from the working directory:
```
    cd ./CodeR_THmodel
    bash run_toy.sh
```

Plot the figure from the working directory:
```
    python plot_two.py
```
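Because each of the three steps above starts from the working directory, they can also be chained from one place. The sketch below is a hypothetical driver, not a file in this repository. It assumes `bash` is on the PATH and uses the current Python interpreter in place of the `python` command.

```python
import subprocess
import sys
from pathlib import Path


def workflow_steps(root="."):
    """Return (command, working directory) pairs in the order given above."""
    root = Path(root)
    return [
        (["bash", "run_toy.sh"], root / "CodeR_BTmodel"),
        (["bash", "run_toy.sh"], root / "CodeR_THmodel"),
        ([sys.executable, "plot_two.py"], root),
    ]


def run_workflow(root=".", dry_run=False):
    """Run each step in its own directory; stop at the first failing step."""
    for command, cwd in workflow_steps(root):
        print(f"[{cwd}] {' '.join(command)}")
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(command, cwd=cwd, check=True)
```

Calling `run_workflow(dry_run=True)` from the working directory prints the three commands without executing them, which is a quick way to confirm the paths before starting a long run.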


Note.
- The experiments were run on a server, with 100 tasks per batch.