## Environment
```shell
pip install --upgrade unsloth unsloth_zoo
pip install trl==0.19.1
pip install seaborn
pip install fire
pip install jsonlines
pip install xgrammar==0.1.18
pip install xformers==0.0.29.post2
pip install transformers==4.53.2
pip install deepspeed==0.14.4
pip install bitsandbytes==0.45.3
pip install vllm==0.8.5
pip install datasets==4.0.0
pip install sentence_transformers
pip install --upgrade pandas
```
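To confirm that the pinned versions above are the ones actually installed, a small check along these lines may help. This is a sketch and not part of the repository: the `PINNED` table simply copies the pins from the install commands, and `installed_version` and `check_pins` are hypothetical helper names.

```python
from importlib import metadata

# Pins copied from the install commands above; unpinned packages are omitted.
PINNED = {
    "trl": "0.19.1",
    "xgrammar": "0.1.18",
    "xformers": "0.0.29.post2",
    "transformers": "4.53.2",
    "deepspeed": "0.14.4",
    "bitsandbytes": "0.45.3",
    "vllm": "0.8.5",
    "datasets": "4.0.0",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins, lookup=installed_version):
    """Return {package: (expected, found)} for every pin that does not match."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

The comparison is an exact string match, so a build that reports a local version suffix would be listed as a mismatch even if the base version is correct.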
## SQV-Alignment

### Stage 1: Data Construction
1. Generate contexts with injected noise using DeepSeek-R1. Set `$NOISY_TYPE` and `$DS_PATH` in the script first.
```
python src/data/gen_ad_context.py 
```
2. Generate reasoning trajectories in SQV format using DeepSeek-R1.
```
python src/data/gen_cot.py
```
### Stage 2: SFT
Next, perform SQV-Alignment with LoRA on the synthetic dataset. Run the following command to start training.
```
python src/sqv_align.py 
```
## Inference
Specify the test dataset name and the `ckpt_path` of the adapter, then run the following command to start inference.
```shell
test_name=""   # name of the test dataset
ckpt_path=""   # path to the adapter checkpoint
python src/sqv_infer.py \
    --test "$test_name" \
    --ckpt "$ckpt_path"
```
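The same invocation can be wrapped in Python so that empty values are caught before the script starts. This is a minimal sketch: `build_infer_command` and `run_inference` are hypothetical helpers, and the only details taken from this README are the script path and the `--test` and `--ckpt` flags.

```python
import subprocess
import sys


def build_infer_command(test_name, ckpt_path, script="src/sqv_infer.py"):
    """Build the argv list for the inference script, rejecting empty values."""
    if not test_name:
        raise ValueError("test_name must not be empty")
    if not ckpt_path:
        raise ValueError("ckpt_path must not be empty")
    # Passing a list (not a shell string) avoids quoting problems in paths.
    return [sys.executable, script, "--test", test_name, "--ckpt", ckpt_path]


def run_inference(test_name, ckpt_path):
    """Run inference; raises CalledProcessError if the script exits non-zero."""
    subprocess.run(build_infer_command(test_name, ckpt_path), check=True)
```

Calling `run_inference("<test dataset name>", "<adapter checkpoint path>")` from the repository root would then be equivalent to the shell command above.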
## 📈 Key Features
✅ **Adversarial Robustness**: Trains models to maintain faithfulness under noisy contexts

✅ **Single-Pass Correction**: Self-verification during generation without iterative refinement

✅ **Efficient Knowledge Transfer**: LoRA alignment reduces trainable parameters by 95% vs full fine-tuning
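To see where LoRA's parameter savings come from, the arithmetic for a single weight matrix can be sketched as follows. LoRA freezes a weight of shape `d_out x d_in` and trains two low-rank factors of shapes `d_out x r` and `r x d_in`, so the trainable count is `r * (d_in + d_out)`. The dimensions and rank below are illustrative assumptions, not values taken from this repository, and the figure for a whole model depends on which layers are adapted.

```python
def lora_trainable_params(d_in, d_out, rank):
    """Trainable parameters LoRA adds for one d_out x d_in weight matrix."""
    return rank * (d_in + d_out)


def reduction_vs_full(d_in, d_out, rank):
    """Fraction of trainable parameters saved relative to tuning the full matrix."""
    full = d_in * d_out
    return 1 - lora_trainable_params(d_in, d_out, rank) / full


if __name__ == "__main__":
    # Hypothetical 4096 x 4096 projection with rank 16.
    print(lora_trainable_params(4096, 4096, 16))      # 131072
    print(f"{reduction_vs_full(4096, 4096, 16):.2%}")  # 99.22%
```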