model_name,thinking_mode,model_family,instruction_tuned,source,ASDiv,Avg,DS-1000,GSM-Hard,GSM8K,HumanEval-Multilingual,HumanEval-Python,MATH,MAWPS,MBPP,SVAMP,TabMWP
DeepSeek/DeepSeek-Coder-Base-1.3B,Non-thinking,DeepSeek,No,DeepSeek-Coder Report Table 8 (Math reasoning base models),48.2,31.9,,14.5,14.6,,,16.8,62.3,,36.7,30.0
DeepSeek/DeepSeek-Coder-Base-6.7B,Non-thinking,DeepSeek,No,DeepSeek-Coder Report Table 8 (Math reasoning base models),67.2,54.7,,40.3,43.2,,,19.2,87.0,,58.4,67.9
DeepSeek/DeepSeek-Coder-Base-33B,Non-thinking,DeepSeek,No,DeepSeek-Coder Report Table 8 (Math reasoning base models),76.7,65.8,,54.1,60.7,,,29.1,93.3,,71.6,75.3
DeepSeek/DeepSeek-Coder-Base-1.3B,Non-thinking,DeepSeek,No,DeepSeek-Coder Report (Coding performance),,,16.2,,,28.3,34.8,,,46.2,,
DeepSeek/DeepSeek-Coder-Base-6.7B,Non-thinking,DeepSeek,No,DeepSeek-Coder Report (Coding performance),,,30.5,,,44.7,49.4,,,60.6,,
DeepSeek/DeepSeek-Coder-Base-33B,Non-thinking,DeepSeek,No,DeepSeek-Coder Report (Coding performance),,,40.2,,,50.3,56.1,,,66.0,,
DeepSeek/DeepSeek-Coder-Instruct-6.7B,Non-thinking,DeepSeek,Yes,DeepSeek-Coder Report (Coding performance),,,,,,66.1,78.6,,,65.4,,
DeepSeek/DeepSeek-Coder-Instruct-33B,Non-thinking,DeepSeek,Yes,DeepSeek-Coder Report (Coding performance),,,,,,69.2,79.3,,,70.0,,
