abs-bvv-6 Total parameters: 2.3B
abs-bvv-6 MMLU [high_school_european_history]: 14.42% ± 1.09% (σ=1.75%)
abs-bvv-6 MMLU [business_ethics]: 19.40% ± 2.10% (σ=3.38%)
abs-bvv-6 MMLU [clinical_knowledge]: 25.55% ± 0.95% (σ=1.53%)
abs-bvv-6 MMLU [medical_genetics]: 27.60% ± 1.88% (σ=3.04%)
abs-bvv-6 MMLU [high_school_us_history]: 17.11% ± 1.21% (σ=1.95%)
abs-bvv-6 MMLU [high_school_physics]: 23.77% ± 2.07% (σ=3.34%)
abs-bvv-6 MMLU [high_school_world_history]: 15.40% ± 0.69% (σ=1.12%)
abs-bvv-6 MMLU [virology]: 24.76% ± 2.03% (σ=3.28%)
abs-bvv-6 MMLU [high_school_microeconomics]: 25.04% ± 1.90% (σ=3.07%)
abs-bvv-6 MMLU [econometrics]: 29.91% ± 1.60% (σ=2.59%)
abs-bvv-6 MMLU [college_computer_science]: 19.50% ± 1.47% (σ=2.38%)
abs-bvv-6 MMLU [high_school_biology]: 24.19% ± 1.15% (σ=1.86%)
abs-bvv-6 MMLU [abstract_algebra]: 9.20% ± 1.51% (σ=2.44%)
abs-bvv-6 MMLU [professional_accounting]: 21.56% ± 0.89% (σ=1.43%)
abs-bvv-6 MMLU [philosophy]: 23.99% ± 0.96% (σ=1.54%)
abs-bvv-6 MMLU [professional_medicine]: 29.74% ± 1.60% (σ=2.59%)
abs-bvv-6 MMLU [nutrition]: 24.18% ± 1.42% (σ=2.29%)
abs-bvv-6 MMLU [global_facts]: 20.50% ± 1.84% (σ=2.97%)
abs-bvv-6 MMLU [machine_learning]: 14.11% ± 1.67% (σ=2.70%)
abs-bvv-6 MMLU [security_studies]: 18.86% ± 0.79% (σ=1.28%)
abs-bvv-6 MMLU [public_relations]: 27.91% ± 1.95% (σ=3.15%)
abs-bvv-6 MMLU [professional_psychology]: 20.38% ± 0.73% (σ=1.19%)
abs-bvv-6 MMLU [prehistory]: 20.99% ± 0.66% (σ=1.06%)
abs-bvv-6 MMLU [anatomy]: 21.33% ± 1.73% (σ=2.79%)
abs-bvv-6 MMLU [human_sexuality]: 26.49% ± 1.67% (σ=2.69%)
abs-bvv-6 MMLU [college_medicine]: 26.47% ± 1.37% (σ=2.21%)
abs-bvv-6 MMLU [high_school_government_and_politics]: 24.25% ± 1.54% (σ=2.48%)
abs-bvv-6 MMLU [college_chemistry]: 26.20% ± 2.61% (σ=4.21%)
abs-bvv-6 MMLU [logical_fallacies]: 17.61% ± 1.15% (σ=1.86%)
abs-bvv-6 MMLU [high_school_geography]: 26.36% ± 1.53% (σ=2.47%)
abs-bvv-6 MMLU [elementary_mathematics]: 22.17% ± 0.56% (σ=0.91%)
abs-bvv-6 MMLU [human_aging]: 23.14% ± 1.10% (σ=1.77%)
abs-bvv-6 MMLU [college_mathematics]: 24.90% ± 1.12% (σ=1.81%)
abs-bvv-6 MMLU [high_school_psychology]: 27.50% ± 0.92% (σ=1.48%)
abs-bvv-6 MMLU [formal_logic]: 24.44% ± 1.39% (σ=2.24%)
abs-bvv-6 MMLU [high_school_statistics]: 23.15% ± 1.03% (σ=1.66%)
abs-bvv-6 MMLU [international_law]: 6.20% ± 1.03% (σ=1.66%)
abs-bvv-6 MMLU [high_school_mathematics]: 23.00% ± 1.09% (σ=1.76%)
abs-bvv-6 MMLU [high_school_computer_science]: 14.80% ± 2.33% (σ=3.76%)
abs-bvv-6 MMLU [conceptual_physics]: 26.26% ± 1.04% (σ=1.67%)
abs-bvv-6 MMLU [miscellaneous]: 21.30% ± 0.55% (σ=0.89%)
abs-bvv-6 MMLU [high_school_chemistry]: 24.53% ± 1.69% (σ=2.72%)
abs-bvv-6 MMLU [marketing]: 21.97% ± 1.55% (σ=2.49%)
abs-bvv-6 MMLU [professional_law]: 13.46% ± 0.54% (σ=0.87%)
abs-bvv-6 MMLU [management]: 28.74% ± 2.54% (σ=4.10%)
abs-bvv-6 MMLU [college_physics]: 25.20% ± 1.44% (σ=2.32%)
abs-bvv-6 MMLU [jurisprudence]: 23.33% ± 2.10% (σ=3.38%)
abs-bvv-6 MMLU [world_religions]: 16.67% ± 0.78% (σ=1.26%)
abs-bvv-6 MMLU [sociology]: 22.59% ± 1.81% (σ=2.93%)
abs-bvv-6 MMLU [us_foreign_policy]: 17.80% ± 1.70% (σ=2.75%)
abs-bvv-6 MMLU [high_school_macroeconomics]: 27.26% ± 1.19% (σ=1.92%)
abs-bvv-6 MMLU [computer_security]: 16.80% ± 1.35% (σ=2.18%)
abs-bvv-6 MMLU [moral_scenarios]: 23.60% ± 0.33% (σ=0.54%)
abs-bvv-6 MMLU [moral_disputes]: 21.45% ± 1.08% (σ=1.74%)
abs-bvv-6 MMLU [electrical_engineering]: 22.69% ± 0.70% (σ=1.13%)
abs-bvv-6 MMLU [astronomy]: 21.78% ± 1.33% (σ=2.15%)
abs-bvv-6 MMLU [college_biology]: 23.12% ± 2.13% (σ=3.43%)
abs-bvv-6 MMLU: 21.63% ± 0.14% (σ=0.22%)
abs-bvv-6 ARC-e: 23.42% ± 0.79% (σ=1.28%)
abs-bvv-6 ARC-c: 25.62% ± 1.19% (σ=1.92%)
abs-bvv-6 C-SENSE: 19.51% ± 0.56% (σ=0.90%)
abs-bvv-6 SQUAD: 5.55% ± 0.65% (σ=1.05%)