# README for Reproducing

## Neo4j Preparation



## Environment Setup
We recommend installing the environment using the `environment.yml` file.
```Bash
conda env create -f environment.yml
```
If the command above fails, you can also build the environment with the following steps:
```bash
# create new environment
conda create -n RiskAtlas python=3.11
conda activate RiskAtlas

# Install pytorch
pip install torch==2.7.1 torchvision==0.22.1 torchaudio==2.7.1 --index-url https://download.pytorch.org/whl/cu128

# Install packages
pip install -r requirements.txt
pip install --upgrade datasets
```

## Model Preparation
Execute `instruct_finetune_code/llama_3_1_70balpaca_finetune.py` to obtain the `synthesis model` and the `obfuscation model`.

Execute `instruct_finetune_code/llama_3_1_8b_alpaca_finetune.py` to obtain the `instruct finetune model`.


## Pipeline Running
**Remember to start `neo4j` before running the pipeline.**
```Bash
# build knowledge graph
python real_pipeline/step1_simple_kg_build.py --domain {your domain}

# load synthesis model and generate harmful prompts
bash scripts/start_vllm_finetune_server.sh
python real_pipeline/step2_harmful_generation.py --domain {your domain}
tmux kill-window -t llama3_1_70b_finetune

# load evaluate model and make toxicity evaluation
bash scripts/start_vllm_granite_guardian_server.sh
python real_pipeline/step3_toxicity_evaluation.py --domain {your domain}
tmux kill-window -t granite_guardian_evaluator

# filter prompts
python real_pipeline/step4_dataset_assembly.py --domain {your domain}

# load obfuscation model and make obfuscation on prompts
bash scripts/start_vllm_finetune_server.sh
python real_pipeline/step5_optimized_batch.py --domain {your domain}
tmux kill-window -t llama3_1_70b_finetune

# generate final domain dataset
python real_pipeline/step6_dataset_finalization.py --domain {your domain}
```
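The steps above depend on services being reachable before each Python script starts: `neo4j` for the knowledge graph and a vLLM server for generation and evaluation. Whether the `scripts/start_vllm_*.sh` scripts block until the server is ready is not stated here, so the following is only a minimal sketch of a readiness check you could run between starting a service and calling the next step. The helper name `wait_for_port` is hypothetical, and the ports in the commented example are assumptions (7687 is Neo4j's default Bolt port, 8000 is vLLM's default when no `--port` is given); adjust them to your configuration.

```bash
#!/usr/bin/env bash
# wait_for_port HOST PORT [TIMEOUT_SECONDS]
# Hypothetical helper: polls a TCP port until it accepts connections
# or the timeout (default 60 s) expires. Uses bash's built-in /dev/tcp
# pseudo-device, so no extra tools are required.
wait_for_port() {
    local host="$1" port="$2" timeout="${3:-60}"
    local deadline=$(( SECONDS + timeout ))
    while (( SECONDS < deadline )); do
        # The subshell attempts to open a connection; it is closed
        # automatically when the subshell exits.
        if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
            return 0
        fi
        sleep 1
    done
    return 1
}

# Example usage (ports are assumptions, see above):
#   wait_for_port localhost 7687 120 || { echo "neo4j is not reachable" >&2; exit 1; }
#   bash scripts/start_vllm_finetune_server.sh
#   wait_for_port localhost 8000 1800 || { echo "vLLM server did not start" >&2; exit 1; }
```

An open port only shows that the process is listening, not that a large model has finished loading, so a generous timeout or an additional request-level check may still be needed.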

To run the whole pipeline in one go, use the script `run_pipeline.sh`:

```Bash
bash run_pipeline.sh --domain {your domain}
```
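If you need datasets for several domains, the one-shot script can be wrapped in a loop. The sketch below assumes that `run_pipeline.sh` is called from the repository root and exits with a non-zero status on failure; the function name `run_domains` and the `DRY_RUN` switch are hypothetical additions for illustration, not part of the repository.

```bash
#!/usr/bin/env bash
# run_domains DOMAIN...
# Hypothetical wrapper: runs the full pipeline once per domain and
# stops at the first failure. Set DRY_RUN=1 to print the commands
# instead of executing them.
run_domains() {
    local domain
    for domain in "$@"; do
        if [[ "${DRY_RUN:-0}" == "1" ]]; then
            echo "bash run_pipeline.sh --domain ${domain}"
        else
            bash run_pipeline.sh --domain "${domain}" || return 1
        fi
    done
}

# Example usage (the domain names are placeholders):
#   DRY_RUN=1 run_domains medicine finance
```

Running the domains sequentially rather than in parallel keeps only one model server on the GPU at a time, which matches the start/kill pattern of the step-by-step commands.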

## Test

### Experiment directory tree
```Text
--experiment--exp1_exp2_dataset # datasets of public benchmarks and our randomly selected data
           |--exp3_dataset--{domain name}--final*.json # RiskAtlas final dataset
                                        |--step3*.json # original dataset with toxicity score
           |--exp4_dataset--RiskAtlas_medicine_prompts_sampled.json
                         |--without_kg_step2_data.json
           |--exp1_dataset.py # code for experiment 1
           |--exp2_finetune.py # finetune code for experiment 2
           |--exp2_evaluate.py # evaluation for experiment 2
           |--exp3_multi_domain.py # code for experiment 3
           |--exp4_ablation_kg.py # code for experiment 4
```
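Before running the experiments it can help to confirm that the files in the tree above are in place. The following sketch checks the two per-domain files listed under `exp3_dataset`. It assumes Experiment 3 reads both `final*.json` and `step3*.json` for the chosen domain, which the tree suggests but does not state; the function name `check_exp3_inputs` is hypothetical.

```bash
#!/usr/bin/env bash
# check_exp3_inputs DOMAIN [ROOT]
# Hypothetical helper: returns 0 only if both the final dataset and the
# step-3 scored dataset exist for DOMAIN under ROOT (default: experiment).
check_exp3_inputs() {
    local domain="$1" root="${2:-experiment}"
    local dir="${root}/exp3_dataset/${domain}"
    local pattern status=0
    for pattern in "final*.json" "step3*.json"; do
        # compgen -G succeeds only if the glob matches at least one path.
        if ! compgen -G "${dir}/${pattern}" > /dev/null; then
            echo "missing: ${dir}/${pattern}" >&2
            status=1
        fi
    done
    return "${status}"
}

# Example usage (the domain name is a placeholder):
#   check_exp3_inputs medicine && python experiment/exp3_multi_domain.py --domain medicine
```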


### Experiment 1: evaluation of ASR on public benchmarks and our RiskAtlas

```Bash
python experiment/exp1_dataset.py
```

### Experiment 2: safety finetune
First, finetune the model:
```Bash

python experiment/exp2_finetune.py --finetune_dataset {finetune dataset name}
```
The safety finetuned model will be saved in `experiment/finetune_model`.

Then fill in the path parameters in `scripts/start_vllm_Llama-3.1-8B-finetune.sh` and run the script to load the safety finetuned model.

Lastly, evaluate the safety finetuned model on unseen datasets.
```Bash
python experiment/exp2_evaluate.py --model {safety finetuned model name} --dataset {attack dataset name} --finetune_dataset {finetune dataset name}
```

To evaluate the safety finetuned model on `MMLU`:
```Bash
conda activate RiskAtlas

git clone --depth 1 https://github.com/EleutherAI/lm-evaluation-harness
cd lm-evaluation-harness

pip install -e .
pip install --upgrade datasets

lm_eval --model hf --model_args pretrained={safety finetuned model path},parallelize=True,load_in_4bit=True,peft={LoRA weight path} --tasks mmlu --num_fewshot 5 --device cuda:0 --batch_size 8 --output_path {your output path}
```

### Experiment 3: metric evaluation for multi domains
To evaluate toxicity score and diversity of the generated domain datasets:
```Bash
python experiment/exp3_multi_domain.py --domain {your domain}
```

### Experiment 4: comparison of results with and without the knowledge graph
```Bash
python experiment/exp4_ablation_kg.py
```

