# Backdoor fine-tuning for the highway dataset

## Data generation

First, generate the data for fine-tuning (both benign and backdoor). The functions in `scenario_generator_xxx` generate data for the different settings.

For example, for GPT, run the following command to generate the data:


```bash
python ./data_generation/scenario_generator_gpt.py
```


For example, to generate a naive backdoor attack where the trigger words are in the question, call the functions `train_benign_generator` and `train_backdoor_generator`, then mix and store the data (as shown in the main function).
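The mix-and-store step could look roughly like the sketch below. It is only an illustration, not the code in the repository's main function: the helper name `mix_and_store`, the JSONL output format, and the fixed shuffle seed are all assumptions, and the two input lists stand in for whatever `train_benign_generator` and `train_backdoor_generator` actually return.

```python
import json
import random


def mix_and_store(benign, backdoor, out_path, seed=0):
    """Hypothetical helper: combine both sample lists, shuffle, write JSONL.

    `benign` and `backdoor` are assumed to be lists of JSON-serializable
    dicts (one dict per fine-tuning example).
    """
    mixed = list(benign) + list(backdoor)
    # A seeded shuffle keeps the mixed file reproducible between runs.
    random.Random(seed).shuffle(mixed)
    with open(out_path, "w", encoding="utf-8") as f:
        for sample in mixed:
            # One JSON object per line (JSONL).
            f.write(json.dumps(sample) + "\n")
    return mixed
```

With the two generator outputs in hand, a call such as `mix_and_store(benign_samples, backdoor_samples, "mixed_train.jsonl")` would produce a single shuffled file; the output file name here is a placeholder.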

## Fine-tuning

Fine-tune the model and set the config using `fine_tune_gpt.ipynb` (remember to update the file name).

## Testing the backdoor/benign models

```bash
python ./test/gpt_test_finetune.py
```

## RAG setting

For the RAG setting, a similar data generator, test script, and Sentence-BERT retriever are implemented under the `RAG` folder.


 