# HuatuoGPT2, One-stage Training for Medical Adaption of LLMs
### Data Unification

<div align=center>
<img src="assets/figure4.png"  width = "50%" alt="HuatuoGPT2" align=center/>
</div>

- HuatuoGPT2 uses an LLM to transform the pre-training corpus into (instruction, output) pairs. Run the following script to perform Data Unification:

```bash
python adapation/data_unification/rewrite.py
```
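
To illustrate the idea behind Data Unification, here is a minimal sketch of turning one corpus passage into an (instruction, output) pair. It is not the code in `rewrite.py`: the prompt text, the `[Question]`/`[Answer]` reply format, and the names `PROMPT_TEMPLATE`, `unify`, and `generate` are all assumptions made for this example. `generate` stands in for whatever LLM call you use.

```python
from typing import Callable, Dict, Optional

# Hypothetical prompt; the prompt used by the actual script may differ.
PROMPT_TEMPLATE = (
    "Based on the following text, write one question a user might ask and "
    "an answer grounded in the text.\n"
    "Format:\n[Question] ...\n[Answer] ...\n\nText:\n{text}"
)


def unify(text: str, generate: Callable[[str], str]) -> Optional[Dict[str, str]]:
    """Turn one corpus passage into an (instruction, output) pair.

    `generate` is any function that maps a prompt string to the LLM reply.
    Returns None when the reply does not follow the assumed format.
    """
    reply = generate(PROMPT_TEMPLATE.format(text=text))
    q_tag, a_tag = "[Question]", "[Answer]"
    q_pos, a_pos = reply.find(q_tag), reply.find(a_tag)
    # Both tags must be present, with the question before the answer.
    if q_pos == -1 or a_pos == -1 or a_pos < q_pos:
        return None
    instruction = reply[q_pos + len(q_tag):a_pos].strip()
    output = reply[a_pos + len(a_tag):].strip()
    if not instruction or not output:
        return None
    return {"instruction": instruction, "output": output}
```

Passing the LLM call in as a function keeps the sketch independent of any particular API client, and replies that do not match the format are dropped rather than repaired.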

### One-stage training
<div align=center>
<img src="assets/figure3.png"  width = "50%" alt="HuatuoGPT2" align=center/>
</div>

- We introduce a priority sampling approach. Pre-process the data with this algorithm by running:

```bash
python adapation/one_stage_training/data_process.py
```
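
One way to picture priority sampling is as weighted sampling without replacement: each data source gets a priority, and at every step a source is drawn with probability proportional to its priority times the number of items it has left. The sketch below rests on that assumption and is not the implementation in `data_process.py`; the function name `priority_order`, the source names, and the priority values are hypothetical.

```python
import random
from typing import Dict, List, Sequence, Tuple


def priority_order(
    sources: Dict[str, Sequence[str]],
    priority: Dict[str, float],
    seed: int = 0,
) -> List[Tuple[str, str]]:
    """Return a training order drawn without replacement.

    At each step a source is chosen with probability proportional to
    priority[source] * (items left in that source), then one of its
    remaining items is taken. Priorities are assumed to be positive.
    """
    rng = random.Random(seed)
    remaining = {name: list(items) for name, items in sources.items()}
    for items in remaining.values():
        rng.shuffle(items)  # random order within each source
    order: List[Tuple[str, str]] = []
    while any(remaining.values()):
        names = [n for n, items in remaining.items() if items]
        weights = [priority[n] * len(remaining[n]) for n in names]
        chosen = rng.choices(names, weights=weights, k=1)[0]
        order.append((chosen, remaining[chosen].pop()))
    return order


if __name__ == "__main__":
    data = {
        "pretrain": [f"p{i}" for i in range(5)],
        "sft": [f"s{i}" for i in range(5)],
    }
    # A higher priority for "pretrain" pushes its items toward the front.
    for source, item in priority_order(data, {"pretrain": 16.0, "sft": 1.0}):
        print(source, item)
```

With a large priority gap, the high-priority source dominates early steps and the mix shifts smoothly toward the other source as it runs out, so both kinds of data fit into a single training run.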

- Then run one-stage training on the processed data:

```bash
bash adapation/one_stage_training/train.sh
```

Training with the One-stage Adaptation method yields the following loss curve:

<div align=center>
<img src="assets/loss.png"  width = "50%" alt="HuatuoGPT2" align=center/>
</div>



### Automated Evaluation of Medical Response Quality

- Single-turn response evaluation using **GPT-4**:

```bash
python evaluation/eval_huatuo_inst.py
```

- Multi-turn dialogue evaluation using **GPT-4**:

```bash
python evaluation/eval_huatuo_conv.py
```
