
## Installation

```
cd language-model-arithmetic/
pip install -e .

cd ../peft/
pip install -e .

conda install -c nvidia cuda-compiler

cd ..
git clone https://github.com/PKU-Alignment/safe-rlhf.git
cd safe-rlhf
pip install .

cd ..
pip install -r requirements.txt
```

## Preparing Data
```
cd code/data
python relabel.py
```

## Training
```
cd code/training
bash run.sh
```

## Evaluation
```
cd code/evaluation
python generate_outputs.py --model_parm_both_name_or_path /path --alpha_helpfulness 0.5 --alpha_harmlessness 0.5
python compute_reward.py --path /path
```
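
To compare several helpfulness/harmlessness trade-offs, the generation command above can be repeated with different weights. The following is a minimal dry-run sketch that only prints the commands. It assumes the two weights are chosen to sum to 1, which this README does not state; `print_sweep_commands` is a hypothetical helper name, and `/path` is the same placeholder used above.

```shell
# Dry-run sketch: print one generate_outputs.py command per weight pair.
# Assumption: the two alphas sum to 1 (not required by the README).
# print_sweep_commands is a hypothetical helper, not part of the repository.
print_sweep_commands() {
  for a in 0.00 0.25 0.50 0.75 1.00; do
    # Complementary weight, formatted to two decimals.
    b=$(awk -v a="$a" 'BEGIN { printf "%.2f", 1 - a }')
    echo "python generate_outputs.py --model_parm_both_name_or_path /path --alpha_helpfulness $a --alpha_harmlessness $b"
  done
}

print_sweep_commands
```

Once the printed commands look right, they can be run from `code/evaluation`, for example by piping the output to `sh`.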