,OpenAI,,,Anthropic,,Google,,,Meta,,,Mistral AI,,,Alibaba,,,,DeepSeek AI,
,GPT-3.5-turbo,GPT-4o-mini,GPT-4o,Claude-3-Haiku,Claude-3-Sonnet,Gemini-1.5-Pro,Gemini-1.5-Flash,Gemma-2-9b-it,Llama-3.1-8B-Instruct,Llama-3.1-70B-Instruct,Llama-3.1-405B-Instruct,7B-Instruct,Codestral,Large-V2,Qwen2-1.5B-Instruct,Qwen2.5-Coder-7B-Instruct,Qwen2.5-7B-Instruct,Qwen2.5-72B-Instruct,Coder-V2,Chat-V2
Parsable,63.80%,95.68%,88.77%,90.56%,67.28%,78.85%,73.31%,52.28%,71.24%,72.56%,72.83%,12.96%,88.81%,71.35%,0.01%,74.10%,81.16%,83.53%,82.17%,82.55%
Solvable,17.63%,15.56%,33.35%,25.52%,32.72%,44.30%,24.10%,14.75%,21.78%,23.03%,43.69%,2.07%,27.90%,28.93%,0.00%,29.44%,11.87%,29.86%,20.36%,25.40%
Equivalent,3.43%,3.17%,27.50%,0.05%,18.86%,11.38%,0.00%,7.45%,0.02%,8.10%,31.95%,1.27%,7.74%,15.99%,0.00%,11.42%,0.00%,12.94%,19.68%,13.23%
