Minimal code for TrojanPraise (under submission), showing that targeted fine-tuning can degrade the alignment of Llama 2 7B Chat. This release includes only the basic training script.

### Run
```bash
python ./finetune.py
```
Outputs will be saved to `poc/result/<timestamp>`.
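
For readers who want to see how a per-run output directory like this can be laid out, here is a minimal sketch. The helper name `make_run_dir` and the `%Y%m%d-%H%M%S` timestamp format are assumptions made for illustration; the format that `finetune.py` actually uses is not documented here.

```python
from datetime import datetime
from pathlib import Path


def make_run_dir(base="poc/result", now=None):
    """Create and return a timestamped run directory under `base`."""
    # Assumption: the timestamp format below is illustrative only;
    # finetune.py may name its run directories differently.
    now = now or datetime.now()
    run_dir = Path(base) / now.strftime("%Y%m%d-%H%M%S")
    # parents=True creates `base` if it does not exist yet.
    run_dir.mkdir(parents=True, exist_ok=True)
    return run_dir


if __name__ == "__main__":
    print(make_run_dir())
```

Giving each run its own directory keeps earlier results from being overwritten.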

### Note
During training, a simple callback prints the model's responses to two unsafe prompts to illustrate alignment degradation.
The data is stored in pickle (`.pkl`) format; each QA pair is also printed in `data.txt` for illustration.
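
A text dump of a pickled QA file can be produced along the lines of the sketch below. The function name `dump_qa_pairs`, the file paths, and the assumed layout (an iterable of `(question, answer)` tuples) are all hypothetical; the real structure of the `.pkl` file may differ, so adjust the unpacking accordingly.

```python
import pickle
from pathlib import Path


def dump_qa_pairs(pkl_path, txt_path):
    """Load QA pairs from a pickle file and write them to a text file."""
    # Only unpickle files you trust: pickle.load can execute arbitrary code.
    with open(pkl_path, "rb") as f:
        pairs = pickle.load(f)

    # Assumption: `pairs` is an iterable of (question, answer) tuples.
    entries = []
    for i, (question, answer) in enumerate(pairs, start=1):
        entries.append(f"[{i}] Q: {question}\n    A: {answer}\n")

    Path(txt_path).write_text("\n".join(entries), encoding="utf-8")
    return len(entries)
```

Calling `dump_qa_pairs("data.pkl", "data.txt")` (both file names are placeholders) returns the number of pairs written.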

### Ethics
For research on robustness and safety only. Do not use to cause harm.