**We include the following datasets, each linked to what we believe is its original source.**

[GSM8K](https://github.com/openai/grade-school-math/tree/3101c7d5072418e28b9008a6636bde82a006892c)

[MMLU](https://github.com/hendrycks/test)

[Mini Crosswords](https://www.goobix.com/crosswords/0505/)

[GAIA](https://huggingface.co/datasets/gaia-benchmark/GAIA)

[HumanEval](https://github.com/openai/human-eval)
