# Setup
Set up two conda environments: one from `requirements.txt` for every language model except gpt-oss-20b, and a second from `requirements_gpt5.txt` for running gpt-oss-20b.
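
As a rough illustration, the two environments could be created with commands like the ones this sketch prints. The environment names (`brt-align`, `brt-align-gpt-oss`) and the Python version are assumptions chosen for the example, not values specified by this repository; only the two requirements files come from the instructions above.

```python
import shlex


def env_setup_commands(env_name, requirements_file, python_version="3.10"):
    """Return the conda and pip commands that create one environment.

    The default Python version is an assumption; use whichever version
    the pinned packages in the requirements file actually support.
    """
    return [
        ["conda", "create", "-y", "-n", env_name, f"python={python_version}"],
        ["conda", "run", "-n", env_name, "pip", "install", "-r", requirements_file],
    ]


# Hypothetical environment names mapped to the requirements files named above.
ENVIRONMENTS = {
    "brt-align": "requirements.txt",
    "brt-align-gpt-oss": "requirements_gpt5.txt",
}

# Print the commands instead of running them, so they can be reviewed first.
for name, requirements in ENVIRONMENTS.items():
    for command in env_setup_commands(name, requirements):
        print(shlex.join(command))
```

The printed lines can be pasted into a shell from the repository root. Keeping gpt-oss-20b in its own environment avoids dependency conflicts between the two requirements files.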

# Generating Datasets
To generate the BeaverTails, RealToxicity, and UltraSafety datasets, see the `data/` directory. To stack shards, use `data/stack_finetune_shards.py`.

# Training BRT-Align Safety Value Function
To train the BRT-Align approaches, see the `toxicity_value_function` directory.
To train the Sample-BRT-Align method, use `toxicity_value_function/train_sample_brt.py`.

# Runtime Monitoring
To run runtime monitoring, use `runtime_monitoring/final_runtime_monitoring_evaluation.py`.
To generate the runtime monitoring plots, use `runtime_monitoring/plot_balanced_runtime_monitor_results.py`.

# Safety Alignment
To run LLM safety alignment, refer to the `alignment` folder. Each type of aligner is specified in `alignment/aligners`. We provide analysis metrics in `alignment/analysis`. 

To run alignment, use `alignment/run_alignment.py`. To load the results and compute the alignment metrics, use `alignment/load_and_compute_final_alignment_results.py`. To print the results table, use `alignment/print_alignment_results_table.py`.
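
The three scripts above form a pipeline: align, compute metrics, then print the table. The sketch below lists them in that order and prints the commands. It assumes each script can be launched from the repository root without extra arguments, which may not hold; check each script's argument parser before setting `dry_run=False`.

```python
import shlex
import subprocess
import sys

# Order follows the description above: align, compute metrics, print the table.
ALIGNMENT_STAGES = [
    "alignment/run_alignment.py",
    "alignment/load_and_compute_final_alignment_results.py",
    "alignment/print_alignment_results_table.py",
]


def stage_commands(python=sys.executable):
    """Build one command per stage (no script arguments: an assumption)."""
    return [[python, script] for script in ALIGNMENT_STAGES]


def run_stages(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(shlex.join(command))
        if not dry_run:
            # check=True stops the pipeline at the first failing stage.
            subprocess.run(command, check=True)


run_stages(stage_commands(), dry_run=True)
```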

To reproduce the results of the XSTest experiment, run `alignment/interactive_prompt_alignment.py` with `--dataset_name` set to `XSTest`.
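
A minimal sketch of that invocation, assuming it is launched from the repository root and that no other flags are required (the script may accept or require more):

```python
import shlex
import sys

SCRIPT = "alignment/interactive_prompt_alignment.py"


def xstest_command(python=sys.executable):
    """Build the XSTest command; the script and flag are the ones named above."""
    return [python, SCRIPT, "--dataset_name", "XSTest"]


# Print the command for review; run it from a shell inside the matching conda environment.
print(shlex.join(xstest_command()))
```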
