# How to Evaluate

1. Evaluate the model on the critique task

```shell
cd src;
./run_critique_tuned.sh
```

The key step is deciding how to prompt each model to produce its critique data. The models evaluated are:

* Auto-J
* TigerScore
* UltraCM
* InternLM2 (baseline)
* Our model
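
Because each model in the list above expects its own prompt format, one way to organise this is a small helper that selects a template by model name. The following is only a sketch: the `build_prompt` function, the `ours` key, and every template string are hypothetical placeholders, not the official prompts of these models and not necessarily what `run_critique_tuned.sh` does.

```shell
# Hypothetical helper: build a critique prompt for one model.
# All template headers below are placeholders; replace each one with the
# prompt format documented by the corresponding model.
build_prompt() (
  model="$1"      # model name, e.g. Auto-J
  question="$2"   # the original question
  response="$3"   # the response to be critiqued

  case "$model" in
    Auto-J)     header='[placeholder: Auto-J template]' ;;
    TigerScore) header='[placeholder: TigerScore template]' ;;
    UltraCM)    header='[placeholder: UltraCM template]' ;;
    InternLM2)  header='[placeholder: InternLM2 baseline template]' ;;
    ours)       header='[placeholder: our model template]' ;;
    *)
      # Fail loudly instead of silently using a wrong template.
      echo "unknown model: $model" >&2
      return 1
      ;;
  esac

  # Fill the shared fields; printf keeps the arguments literal.
  printf '%s\nQuestion: %s\nResponse: %s\nCritique:\n' \
    "$header" "$question" "$response"
)
```

For example, `build_prompt Auto-J "What is 2+2?" "5"` prints the placeholder header followed by the question, the response, and an empty `Critique:` line for the model to complete.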

2. Evaluate the model on the correction task

```shell
cd src;
./run_critique_tuned.sh
```
