abs-bvv-5 Total parameters: 2.1B
abs-bvv-5 MMLU [high_school_european_history]: 13.82% ± 1.27% (σ=2.04%)
abs-bvv-5 MMLU [business_ethics]: 17.90% ± 2.16% (σ=3.48%)
abs-bvv-5 MMLU [clinical_knowledge]: 25.92% ± 1.42% (σ=2.30%)
abs-bvv-5 MMLU [medical_genetics]: 22.80% ± 2.61% (σ=4.21%)
abs-bvv-5 MMLU [high_school_us_history]: 14.22% ± 1.22% (σ=1.97%)
abs-bvv-5 MMLU [high_school_physics]: 20.99% ± 1.22% (σ=1.97%)
abs-bvv-5 MMLU [high_school_world_history]: 14.51% ± 0.77% (σ=1.24%)
abs-bvv-5 MMLU [virology]: 20.60% ± 2.17% (σ=3.50%)
abs-bvv-5 MMLU [high_school_microeconomics]: 26.68% ± 2.05% (σ=3.31%)
abs-bvv-5 MMLU [econometrics]: 25.35% ± 1.97% (σ=3.17%)
abs-bvv-5 MMLU [college_computer_science]: 20.60% ± 1.39% (σ=2.24%)
abs-bvv-5 MMLU [high_school_biology]: 23.48% ± 1.48% (σ=2.39%)
abs-bvv-5 MMLU [abstract_algebra]: 9.20% ± 1.03% (σ=1.66%)
abs-bvv-5 MMLU [professional_accounting]: 19.15% ± 1.09% (σ=1.75%)
abs-bvv-5 MMLU [philosophy]: 23.02% ± 1.10% (σ=1.77%)
abs-bvv-5 MMLU [professional_medicine]: 29.93% ± 1.55% (σ=2.50%)
abs-bvv-5 MMLU [nutrition]: 22.09% ± 1.02% (σ=1.65%)
abs-bvv-5 MMLU [global_facts]: 14.60% ± 1.76% (σ=2.84%)
abs-bvv-5 MMLU [machine_learning]: 12.68% ± 1.37% (σ=2.22%)
abs-bvv-5 MMLU [security_studies]: 18.86% ± 0.89% (σ=1.43%)
abs-bvv-5 MMLU [public_relations]: 23.64% ± 2.05% (σ=3.30%)
abs-bvv-5 MMLU [professional_psychology]: 18.61% ± 0.50% (σ=0.80%)
abs-bvv-5 MMLU [prehistory]: 19.01% ± 0.65% (σ=1.05%)
abs-bvv-5 MMLU [anatomy]: 23.11% ± 1.00% (σ=1.62%)
abs-bvv-5 MMLU [human_sexuality]: 25.80% ± 1.21% (σ=1.96%)
abs-bvv-5 MMLU [college_medicine]: 26.24% ± 1.40% (σ=2.26%)
abs-bvv-5 MMLU [high_school_government_and_politics]: 22.90% ± 1.85% (σ=2.99%)
abs-bvv-5 MMLU [college_chemistry]: 24.90% ± 2.29% (σ=3.70%)
abs-bvv-5 MMLU [logical_fallacies]: 17.06% ± 1.28% (σ=2.07%)
abs-bvv-5 MMLU [high_school_geography]: 26.06% ± 0.98% (σ=1.58%)
abs-bvv-5 MMLU [elementary_mathematics]: 19.52% ± 1.15% (σ=1.85%)
abs-bvv-5 MMLU [human_aging]: 20.90% ± 1.77% (σ=2.85%)
abs-bvv-5 MMLU [college_mathematics]: 22.10% ± 2.06% (σ=3.33%)
abs-bvv-5 MMLU [high_school_psychology]: 25.05% ± 1.18% (σ=1.90%)
abs-bvv-5 MMLU [formal_logic]: 22.14% ± 2.30% (σ=3.71%)
abs-bvv-5 MMLU [high_school_statistics]: 22.08% ± 1.31% (σ=2.11%)
abs-bvv-5 MMLU [international_law]: 6.20% ± 1.17% (σ=1.89%)
abs-bvv-5 MMLU [high_school_mathematics]: 18.85% ± 1.01% (σ=1.63%)
abs-bvv-5 MMLU [high_school_computer_science]: 15.30% ± 1.24% (σ=2.00%)
abs-bvv-5 MMLU [conceptual_physics]: 25.15% ± 0.75% (σ=1.21%)
abs-bvv-5 MMLU [miscellaneous]: 19.35% ± 0.65% (σ=1.04%)
abs-bvv-5 MMLU [high_school_chemistry]: 22.32% ± 1.43% (σ=2.30%)
abs-bvv-5 MMLU [marketing]: 20.38% ± 1.62% (σ=2.61%)
abs-bvv-5 MMLU [professional_law]: 12.93% ± 0.34% (σ=0.54%)
abs-bvv-5 MMLU [management]: 24.27% ± 1.55% (σ=2.49%)
abs-bvv-5 MMLU [college_physics]: 24.90% ± 1.42% (σ=2.29%)
abs-bvv-5 MMLU [jurisprudence]: 19.91% ± 1.52% (σ=2.46%)
abs-bvv-5 MMLU [world_religions]: 16.14% ± 1.29% (σ=2.08%)
abs-bvv-5 MMLU [sociology]: 20.15% ± 1.54% (σ=2.49%)
abs-bvv-5 MMLU [us_foreign_policy]: 20.30% ± 2.35% (σ=3.80%)
abs-bvv-5 MMLU [high_school_macroeconomics]: 26.54% ± 0.90% (σ=1.45%)
abs-bvv-5 MMLU [computer_security]: 17.90% ± 1.67% (σ=2.70%)
abs-bvv-5 MMLU [moral_scenarios]: 22.02% ± 0.65% (σ=1.05%)
abs-bvv-5 MMLU [moral_disputes]: 21.10% ± 1.04% (σ=1.69%)
abs-bvv-5 MMLU [electrical_engineering]: 19.03% ± 1.42% (σ=2.29%)
abs-bvv-5 MMLU [astronomy]: 19.67% ± 1.67% (σ=2.69%)
abs-bvv-5 MMLU [college_biology]: 20.42% ± 1.55% (σ=2.51%)
abs-bvv-5 MMLU: 20.33% ± 0.21% (σ=0.34%)
abs-bvv-5 ARC-e: 20.42% ± 0.56% (σ=0.90%)
abs-bvv-5 ARC-c: 23.24% ± 0.80% (σ=1.30%)
abs-bvv-5 C-SENSE: 19.80% ± 0.62% (σ=1.00%)
abs-bvv-5 SQUAD: 2.30% ± 0.45% (σ=0.73%)