
This repository contains code and experiments for training and testing Fairness Reward Models. Our implementation is based on the test-time compute approach from Beeching et al. (2024).

```bibtex
@misc{beeching2024scalingtesttimecompute,
      title={Scaling test-time compute with open models},
      author={Edward Beeching and Lewis Tunstall and Sasha Rush},
      url={https://huggingface.co/spaces/HuggingFaceH4/blogpost-scaling-test-time-compute},
}
```

## Configuration

The project uses a modular configuration system located in `src/sal/config/`. Configuration can be specified in two ways:

1. Using a YAML file:
```yaml
model:
  model_path: "meta-llama/Llama-3.2-1B-Instruct"
  gpu_memory_utilization: 0.6
  prm_paths: [PATH]
  custom_chat_template: null  # Optional: Override the default chat template

dataset:
  name: "LabHC/bias_in_bios"
  split: "train"

output:
  push_to_hub: true
  hub_dataset_private: true
  apply_voting: true

search:
  approach: "best_of_n"
  n: 4
  temperature: 0.8  # Sampling temperature for vLLM text generation
  top_p: 1.0
  prm_batch_size: 2
  search_batch_size: 25
  seed: 42
  max_tokens: 2048
  agg_strategy: "log_sum"
  math_temperature: 0.5  # Temperature for weighted sum calculations (lower = sharper distribution)
```

2. Using a recipe name:
```bash
python scripts/test_time_compute.py --recipe "llama_best_of_n"
```
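The `agg_strategy` and `math_temperature` settings in the YAML example can be illustrated with a small, self-contained sketch. It is not the repository's implementation: it assumes that `"log_sum"` means summing the logarithms of per-step reward scores, and that `math_temperature` scales a softmax over those aggregated scores before a weighted vote. The function names and the sample data are hypothetical.

```python
import math
from collections import defaultdict


def log_sum(step_scores):
    """Aggregate per-step scores in (0, 1] by summing their logs.

    Assumed meaning of agg_strategy "log_sum"; not taken from the repo.
    """
    return sum(math.log(s) for s in step_scores)


def softmax_weights(scores, temperature):
    """Temperature-scaled softmax; a lower temperature sharpens the weights."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_vote(answers, step_scores_per_answer, temperature=0.5):
    """Return the answer whose candidates carry the largest total weight."""
    scores = [log_sum(steps) for steps in step_scores_per_answer]
    weights = softmax_weights(scores, temperature)
    totals = defaultdict(float)
    for answer, weight in zip(answers, weights):
        totals[answer] += weight
    return max(totals, key=totals.get)


# Hypothetical candidates (n = 4) with made-up per-step reward scores.
answers = ["nurse", "surgeon", "nurse", "teacher"]
step_scores = [[0.9, 0.8], [0.95, 0.9], [0.7, 0.9], [0.4, 0.5]]

print(weighted_vote(answers, step_scores, temperature=0.5))
print(weighted_vote(answers, step_scores, temperature=0.1))
```

With these made-up scores, the two `"nurse"` candidates together outweigh the single highest-scoring candidate at a temperature of 0.5, while at 0.1 the sharper distribution lets the top-scoring candidate win on its own.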

## Running the Code

### Test-Time Compute

To run test-time compute with a configuration file:

```bash
python scripts/test_time_compute.py --config path/to/config.yaml
```

Or using a recipe:

```bash
python scripts/test_time_compute.py --recipe "llama_best_of_n"
```
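The two invocation styles above could be handled by a single argument parser. The following is a minimal sketch, not the actual contents of `scripts/test_time_compute.py`: it assumes that exactly one of `--config` and `--recipe` is given, and the `recipes/` directory and the `<name>.yaml` naming scheme are hypothetical.

```python
import argparse
from pathlib import Path

# Hypothetical location of named recipes; the real layout may differ.
RECIPE_DIR = Path("recipes")


def build_parser():
    parser = argparse.ArgumentParser(description="Run test-time compute.")
    # Assumption: the two options are alternatives, so exactly one is required.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--config", type=Path, help="Path to a YAML config file")
    group.add_argument("--recipe", help="Name of a predefined recipe")
    return parser


def resolve_config_path(argv):
    """Map the command-line arguments to the YAML file that would be loaded."""
    args = build_parser().parse_args(argv)
    if args.config is not None:
        return args.config
    return RECIPE_DIR / f"{args.recipe}.yaml"


print(resolve_config_path(["--recipe", "llama_best_of_n"]))
print(resolve_config_path(["--config", "path/to/config.yaml"]))
```

Using a mutually exclusive group makes `argparse` reject a command line that passes both flags or neither, so the ambiguity is caught before any model is loaded.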

