Model,LiveBench model ID (version),Overall,Coding,Data Analysis,Instruction Following,Language,Math,Reasoning
Claude-3 Haiku,claude-3-haiku-20240307,35.3,24.5,41.5,64.0,30.1,25.7,26.0
Claude-3 Sonnet,claude-3-sonnet-20240229,38.1,25.2,44.6,65.0,38.1,29.6,26.0
Claude-3 Opus,claude-3-opus-20240229,50.8,40.1,54.3,70.9,51.7,46.5,41.0
Claude-3.5 Sonnet,claude-3-5-sonnet-20240620,61.2,63.2,56.7,72.3,56.9,53.7,64.0
Cohere Command R,command-r,27.2,14.9,31.7,57.2,14.6,16.9,28.0
Cohere Command R+,command-r-plus,32.9,20.3,24.6,71.5,23.9,24.9,32.0
Gemini 1.0 Pro,gemini-1.0-pro-instruct,40.2,31.8,29.6,68.3,29.7,43.4,42.0
Gemini 1.5 Flash,gemini-1.5-flash-api-0514,40.9,39.1,44.0,63.0,30.7,38.5,30.0
Gemini 1.5 Pro,gemini-1.5-pro-api-0514,44.4,32.8,52.8,67.2,38.3,42.1,33.0
GPT-3.5-turbo,gpt-3.5-turbo-0125,34.4,29.2,41.2,60.5,24.2,25.5,26.0
GPT-4-turbo,gpt-4-turbo-2024-04-09,53.0,47.1,51.3,71.4,45.3,49.0,54.0
GPT-4o,gpt-4o-2024-05-13,55.0,46.4,52.4,72.2,53.9,49.9,55.0
Llama 3 8B Instruct,meta-llama-3-8b-instruct,26.7,18.3,23.3,57.1,18.7,17.6,25.0
Llama 3 70B Instruct,meta-llama-3-70b-instruct,37.4,20.9,42.4,63.5,34.1,32.3,31.0
Mistral Large,mistral-large-2402,38.9,26.8,42.6,68.2,32.7,32.2,35.0
Mixtral 8×22B MoE,mixtral-8x22b-instruct-v0.1,34.8,33.1,30.3,63.2,26.5,26.9,29.0
