# Supplementary Material

This package reproduces the LoRA fine-tuning and inference pipeline described in our submission.  
The archive itself is **self-contained** and small (≤ 100 MB) because it ships only:

* A 90 MB XZ-compressed LoRA adapter (`weights/adapter_model.safetensors.xz`)
* Scripts that **download the 32B base model** from Hugging Face and **unpack** the adapter automatically

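The bundled scripts unpack the adapter automatically. If you want to inspect the weights before running them, the manual equivalent can be sketched as follows. This is a minimal sketch, not the package's own code: it assumes the `xz` command-line tool is installed, `unpack_adapter` is a hypothetical helper name, and the output location (next to the archive) is an assumption that may differ from where the scripts place the file.

```bash
# Hypothetical helper (not part of this package): decompress an .xz archive
# next to itself and keep the compressed original.
unpack_adapter() {
    local archive="$1"
    local target="${archive%.xz}"      # strip the .xz suffix to get the output path

    # Skip the work if the decompressed file is already present.
    if [ -f "$target" ]; then
        echo "already unpacked: $target"
        return 0
    fi

    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi

    # --keep leaves the .xz file in place so the step can be repeated.
    xz --decompress --keep "$archive" || return 1
    echo "unpacked: $target"
}
```

Run from the release directory, `unpack_adapter weights/adapter_model.safetensors.xz` would leave `weights/adapter_model.safetensors` beside the archive.
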
> Tested on 8 × A100-80 GB, CUDA 12.2, Ubuntu 22.04, Python 3.10.

---

## Quick Start

```bash
# 0. extract the ZIP archive
unzip supplementary_material.zip
cd DeepSeek_LoRA_Release          # release root; the steps below start from here

# 1. (once) install dependencies
conda create -n deepseek_lora python=3.10 -y
conda activate deepseek_lora
pip install -r requirements.txt

# 2. (once) download the 240 GB base model and verify its checksum
bash DOWNLOAD_BASE_MODEL.sh

# 3. (optional) training / merging
cd LLMTrainProject/script_sy
bash run_llm_sft_lora.sh      # LoRA fine-tuning
bash run_llm_merge.sh         # merge adapter into base (creates full model)

# 4. inference service
cd ../../LLMInferProject/script
bash run_llm_server.sh        # vLLM API at 
bash run_llm_web.sh           # simple web demo at 

