## Overview

This folder contains all the evaluation benchmarks (AGIEval, MMLU, ARC, RACE, BBH, Commonsense) and AbsR, the dataset we constructed. All files are formatted for perplexity-based (PPL-based) evaluation. AbsR also includes the training set used to train MeanLearn.
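
As a rough illustration of what PPL-based evaluation usually means for multiple-choice benchmarks, the sketch below scores each candidate answer by the perplexity of its tokens and picks the lowest one. This is a minimal sketch under stated assumptions, not the evaluation code used for these files: the function names `perplexity` and `pick_answer` are hypothetical, and the log-probabilities are made-up toy values rather than model output.

```python
import math


def perplexity(token_logprobs):
    """Perplexity of a sequence from its per-token natural-log probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def pick_answer(option_logprobs):
    """Return the option label whose token sequence has the lowest perplexity."""
    return min(option_logprobs, key=lambda label: perplexity(option_logprobs[label]))


# Toy per-token log-probabilities (assumed values for illustration only).
scores = {
    "A": [-2.0, -1.5, -3.0],
    "B": [-0.5, -0.7, -0.6],
    "C": [-1.0, -2.5],
}
prediction = pick_answer(scores)
```

In practice the log-probabilities would come from a language model scoring the prompt together with each candidate answer; normalizing by length keeps longer options from being penalized simply for having more tokens.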





## Structure

```bash
agieval # the AGIEval dataset in PPL-based evaluation format
arc # the ARC dataset in PPL-based evaluation format
bbh # the BBH dataset in PPL-based evaluation format
commonsense # the Commonsense dataset in PPL-based evaluation format
mmlu # the MMLU dataset in PPL-based evaluation format
race # the RACE dataset in PPL-based evaluation format

AbsR
 --ablation # contains training data for ablation studies
 --test # the test set of AbsR
 --train.jsonl # the training set of AbsR, in JSON Lines format
```
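
Since the AbsR training set is stored as JSON Lines (one JSON object per line), it can be read with the standard library alone. The sketch below is a hedged example: `read_jsonlines` is a hypothetical helper, and the keys in the demo records are placeholders, not the actual AbsR schema, which is not described here.

```python
import json
import tempfile
from pathlib import Path


def read_jsonlines(path):
    """Parse a JSON Lines file into a list, one object per non-empty line."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as error:
                raise ValueError(f"{path}:{line_number}: invalid JSON") from error
    return records


# Demo on a temporary file; the keys below are placeholders, not the AbsR schema.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.jsonl"
    demo_path.write_text(
        '{"id": 1, "text": "first"}\n\n{"id": 2, "text": "second"}\n',
        encoding="utf-8",
    )
    demo_records = read_jsonlines(demo_path)
```

To load the real data, pass the path of the AbsR training file to `read_jsonlines` and inspect the keys of the first record before relying on any field names.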





## Statistics

Please refer to the paper.



## Some Samples of AbsR

![Samples from AbsR](assets/image-20240520212233804.png)

