## Training

### Fine-tune an LLM on GSM8K

```bash
cd TVM
bash scripts/gsm8k/train_generator.sh
```


### Generate reasoning paths

```bash
cd TVM
bash scripts/gsm8k/generate.sh
```

The generated reasoning paths are saved to `data/gsm8k/model_generation/`.


### Train TVM

```bash
cd TVM
bash scripts/gsm8k/train_verifier.sh
```



## Inference

### Best-of-N Search

1. Generate reasoning paths on the test set, with `--target_set test`:
    ```bash
    cd TVM
    bash scripts/gsm8k/generate.sh
    ```

2. Then evaluate the generated paths with the verifier:
    ```bash
    cd TVM
    bash scripts/gsm8k/eval_with_verifier.sh
    ```

The output is saved to `eval_results/gsm8k/verifier/test`.
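
For readers new to the idea, Best-of-N selection can be sketched in a few lines of Python: sample N candidate reasoning paths for a question, score each with the verifier, and keep the highest-scoring one. This is a minimal illustration of the general technique, not the code in `eval_with_verifier.sh`; the `Candidate` class, its fields, and the `best_of_n` function are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Candidate:
    """One sampled reasoning path (hypothetical container, not from the repo)."""
    text: str     # full reasoning path produced by the generator
    answer: str   # final answer extracted from the path
    score: float  # verifier score; higher is assumed to mean "more likely correct"


def best_of_n(candidates: Sequence[Candidate]) -> Candidate:
    """Return the candidate with the highest verifier score."""
    if not candidates:
        raise ValueError("need at least one candidate")
    return max(candidates, key=lambda c: c.score)


def accuracy(groups: Sequence[Sequence[Candidate]], gold: Sequence[str]) -> float:
    """Fraction of questions whose selected answer matches the gold answer."""
    if len(groups) != len(gold):
        raise ValueError("groups and gold must have the same length")
    hits = sum(best_of_n(g).answer == a for g, a in zip(groups, gold))
    return hits / len(gold)
```

Here `groups` holds one list of N candidates per test question. Selection only compares scores within a question, so the verifier does not need to be calibrated across questions.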


### Step-by-Step Beam Search

```bash
cd TVM
bash scripts/gsm8k/eval_step_beam.sh
```

The output is saved to `eval_results/gsm8k/generator_with_verifier/test`.
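
As a rough picture of what a step-level beam search does, the sketch below extends partial reasoning paths one step at a time and keeps only the top-scoring prefixes after each step. It is a generic illustration under stated assumptions, not the logic of `eval_step_beam.sh`: the callables `propose_steps`, `score_prefix`, and `is_finished`, as well as the `beam_size` and `max_steps` parameters, are hypothetical stand-ins for the generator, the verifier, and the stopping rule.

```python
from typing import Callable, List, Sequence


def step_beam_search(
    question: str,
    propose_steps: Callable[[str], Sequence[str]],  # stand-in for the generator
    score_prefix: Callable[[str], float],           # stand-in for the verifier
    is_finished: Callable[[str], bool],             # stand-in for the stopping rule
    beam_size: int = 2,
    max_steps: int = 8,
) -> str:
    """Grow reasoning paths step by step, pruning to `beam_size` after each step."""
    beams: List[str] = [question]
    for _ in range(max_steps):
        if all(is_finished(b) for b in beams):
            break
        expanded: List[str] = []
        for prefix in beams:
            if is_finished(prefix):
                # Finished paths are carried forward unchanged.
                expanded.append(prefix)
            else:
                # Each proposed next step yields a longer partial path.
                expanded.extend(prefix + step for step in propose_steps(prefix))
        if not expanded:
            break
        # Keep the highest-scoring prefixes according to the verifier stand-in.
        beams = sorted(expanded, key=score_prefix, reverse=True)[:beam_size]
    return max(beams, key=score_prefix)
```

Unlike Best-of-N, which scores only complete paths, this style of search consults the verifier after every step, so low-scoring partial paths are dropped before more generation is spent on them.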
