Model	Overall	Easy	Hard	Short	Medium	Long
Qwen2.5-7B-Instruct_4096	23.86	22.92	24.44	27.22	24.19	17.59
Llama-3.1-8B-Instruct_1024	29.62	28.12	30.55	35.00	27.91	24.07
Qwen2.5-14B-Instruct_2048	35.59	42.71	31.19	38.89	35.81	29.63
Qwen2.5-14B-Instruct_512	35.19	41.67	31.19	39.44	34.42	29.63
Llama-3.1-8B-Instruct_2048	30.42	30.21	30.55	36.11	27.44	26.85
Qwen2.5-14B-Instruct_256	34.00	42.19	28.94	36.11	34.42	29.63
Llama-3.1-8B-Instruct_256	30.42	28.65	31.51	33.89	30.23	25.00
Qwen2.5-14B-Instruct_1024	35.39	43.75	30.23	38.33	35.81	29.63
Llama-3.1-8B-Instruct_4096	30.22	28.12	31.51	36.67	27.44	25.00
Qwen2.5-14B-Instruct_4096	34.19	40.10	30.55	38.89	34.42	25.93
Qwen2.5-7B-Instruct_256	24.65	24.48	24.76	31.67	23.26	15.74
Qwen2.5-7B-Instruct_1024	23.26	22.40	23.79	26.67	23.72	16.67
Llama-3.1-8B-Instruct_512	29.22	27.60	30.23	33.89	27.44	25.00
Qwen2.5-7B-Instruct_2048	23.46	24.48	22.83	27.22	23.26	17.59
Qwen2.5-7B-Instruct_512	23.46	22.92	23.79	31.11	21.40	14.81