abs-bvv-2 Total parameters: 1.5B
abs-bvv-2 MMLU [high_school_european_history]: 14.42% ± 1.57% (σ=2.54%)
abs-bvv-2 MMLU [business_ethics]: 18.00% ± 2.51% (σ=4.05%)
abs-bvv-2 MMLU [clinical_knowledge]: 26.15% ± 1.05% (σ=1.69%)
abs-bvv-2 MMLU [medical_genetics]: 22.80% ± 1.49% (σ=2.40%)
abs-bvv-2 MMLU [high_school_us_history]: 15.59% ± 0.93% (σ=1.50%)
abs-bvv-2 MMLU [high_school_physics]: 21.59% ± 1.64% (σ=2.65%)
abs-bvv-2 MMLU [high_school_world_history]: 12.83% ± 0.94% (σ=1.52%)
abs-bvv-2 MMLU [virology]: 19.76% ± 1.95% (σ=3.14%)
abs-bvv-2 MMLU [high_school_microeconomics]: 25.55% ± 1.18% (σ=1.91%)
abs-bvv-2 MMLU [econometrics]: 23.60% ± 1.15% (σ=1.86%)
abs-bvv-2 MMLU [college_computer_science]: 20.90% ± 1.50% (σ=2.43%)
abs-bvv-2 MMLU [high_school_biology]: 22.81% ± 0.63% (σ=1.02%)
abs-bvv-2 MMLU [abstract_algebra]: 15.50% ± 2.06% (σ=3.32%)
abs-bvv-2 MMLU [professional_accounting]: 18.26% ± 0.89% (σ=1.43%)
abs-bvv-2 MMLU [philosophy]: 21.25% ± 1.29% (σ=2.08%)
abs-bvv-2 MMLU [professional_medicine]: 31.14% ± 2.00% (σ=3.23%)
abs-bvv-2 MMLU [nutrition]: 21.99% ± 1.27% (σ=2.05%)
abs-bvv-2 MMLU [global_facts]: 18.70% ± 1.86% (σ=3.00%)
abs-bvv-2 MMLU [machine_learning]: 16.96% ± 2.17% (σ=3.50%)
abs-bvv-2 MMLU [security_studies]: 16.00% ± 1.32% (σ=2.13%)
abs-bvv-2 MMLU [public_relations]: 19.45% ± 2.19% (σ=3.53%)
abs-bvv-2 MMLU [professional_psychology]: 17.29% ± 0.66% (σ=1.07%)
abs-bvv-2 MMLU [prehistory]: 19.38% ± 0.51% (σ=0.83%)
abs-bvv-2 MMLU [anatomy]: 22.89% ± 1.71% (σ=2.76%)
abs-bvv-2 MMLU [human_sexuality]: 23.97% ± 2.05% (σ=3.31%)
abs-bvv-2 MMLU [college_medicine]: 22.95% ± 1.65% (σ=2.66%)
abs-bvv-2 MMLU [high_school_government_and_politics]: 22.95% ± 1.92% (σ=3.09%)
abs-bvv-2 MMLU [college_chemistry]: 25.80% ± 1.92% (σ=3.09%)
abs-bvv-2 MMLU [logical_fallacies]: 17.61% ± 1.58% (σ=2.55%)
abs-bvv-2 MMLU [high_school_geography]: 25.15% ± 2.30% (σ=3.72%)
abs-bvv-2 MMLU [elementary_mathematics]: 19.97% ± 0.93% (σ=1.51%)
abs-bvv-2 MMLU [human_aging]: 16.95% ± 1.02% (σ=1.64%)
abs-bvv-2 MMLU [college_mathematics]: 23.30% ± 1.21% (σ=1.95%)
abs-bvv-2 MMLU [high_school_psychology]: 25.19% ± 0.95% (σ=1.53%)
abs-bvv-2 MMLU [formal_logic]: 20.56% ± 1.95% (σ=3.14%)
abs-bvv-2 MMLU [high_school_statistics]: 18.75% ± 1.45% (σ=2.34%)
abs-bvv-2 MMLU [international_law]: 8.68% ± 1.45% (σ=2.34%)
abs-bvv-2 MMLU [high_school_mathematics]: 20.22% ± 0.77% (σ=1.24%)
abs-bvv-2 MMLU [high_school_computer_science]: 14.50% ± 1.31% (σ=2.11%)
abs-bvv-2 MMLU [conceptual_physics]: 23.62% ± 0.72% (σ=1.16%)
abs-bvv-2 MMLU [miscellaneous]: 18.25% ± 0.36% (σ=0.59%)
abs-bvv-2 MMLU [high_school_chemistry]: 21.77% ± 1.73% (σ=2.78%)
abs-bvv-2 MMLU [marketing]: 19.02% ± 1.66% (σ=2.68%)
abs-bvv-2 MMLU [professional_law]: 10.31% ± 0.36% (σ=0.59%)
abs-bvv-2 MMLU [management]: 28.06% ± 2.41% (σ=3.89%)
abs-bvv-2 MMLU [college_physics]: 23.53% ± 2.16% (σ=3.48%)
abs-bvv-2 MMLU [jurisprudence]: 19.35% ± 2.11% (σ=3.40%)
abs-bvv-2 MMLU [world_religions]: 15.20% ± 0.92% (σ=1.48%)
abs-bvv-2 MMLU [sociology]: 19.35% ± 0.61% (σ=0.98%)
abs-bvv-2 MMLU [us_foreign_policy]: 19.60% ± 1.98% (σ=3.20%)
abs-bvv-2 MMLU [high_school_macroeconomics]: 26.64% ± 1.13% (σ=1.82%)
abs-bvv-2 MMLU [computer_security]: 16.90% ± 1.22% (σ=1.97%)
abs-bvv-2 MMLU [moral_scenarios]: 19.03% ± 0.71% (σ=1.14%)
abs-bvv-2 MMLU [moral_disputes]: 19.36% ± 1.02% (σ=1.65%)
abs-bvv-2 MMLU [electrical_engineering]: 18.90% ± 1.82% (σ=2.93%)
abs-bvv-2 MMLU [astronomy]: 19.87% ± 1.77% (σ=2.85%)
abs-bvv-2 MMLU [college_biology]: 16.88% ± 1.83% (σ=2.95%)
abs-bvv-2 MMLU: 19.43% ± 0.14% (σ=0.23%)
abs-bvv-2 ARC-e: 21.04% ± 0.80% (σ=1.29%)
abs-bvv-2 ARC-c: 21.71% ± 0.66% (σ=1.07%)
abs-bvv-2 C-SENSE: 19.66% ± 0.68% (σ=1.10%)
abs-bvv-2 SQUAD: 0.82% ± 0.40% (σ=0.64%)
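
The result lines above share one shape: model name, benchmark, an optional bracketed MMLU subject, then "mean% ± margin% (σ=sigma%)". The following is a minimal Python sketch for loading such lines into records. The names LINE_RE, parse_line, and margin_to_sigma are hypothetical helpers introduced for illustration and are not part of any evaluation harness. The log does not say how the ± margin is derived. In the listed rows the margin is roughly 0.62 times σ, which would be consistent with a 95% normal interval over 10 runs (1.96 / sqrt(10)), but that reading is an assumption and not something the log states.

```python
import math
import re

# Hypothetical pattern for one result line, e.g.
#   abs-bvv-2 MMLU [business_ethics]: 18.00% ± 2.51% (σ=4.05%)
#   abs-bvv-2 ARC-e: 21.04% ± 0.80% (σ=1.29%)
LINE_RE = re.compile(
    r"^(?P<model>\S+)\s+(?P<bench>[\w-]+)"
    r"(?:\s+\[(?P<subset>\w+)\])?:\s+"
    r"(?P<mean>[\d.]+)%\s+±\s+(?P<margin>[\d.]+)%\s+"
    r"\(σ=(?P<sigma>[\d.]+)%\)$"
)


def parse_line(line):
    """Return a dict for a score line, or None for any other line."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    # Convert the three percentage fields from text to floats.
    for key in ("mean", "margin", "sigma"):
        rec[key] = float(rec[key])
    return rec


def margin_to_sigma(rec):
    """Ratio of the reported ± margin to the reported σ."""
    return rec["margin"] / rec["sigma"]


# A few lines copied verbatim from the log above.
SAMPLE = """\
abs-bvv-2 Total parameters:     1.5B
abs-bvv-2 MMLU [business_ethics]: 18.00% ± 2.51% (σ=4.05%)
abs-bvv-2 ARC-e: 21.04% ± 0.80% (σ=1.29%)
"""

if __name__ == "__main__":
    # Assumed reference value only: 1.96 / sqrt(10), see the note above.
    assumed = 1.96 / math.sqrt(10)
    for raw in SAMPLE.splitlines():
        rec = parse_line(raw)
        if rec is None:
            continue  # skips the "Total parameters" header line
        label = rec["subset"] or rec["bench"]
        print(label, rec["mean"], round(margin_to_sigma(rec), 3), round(assumed, 3))
```

Lines that do not match the pattern, such as the "Total parameters" header, return None, so the whole log can be passed through parse_line and filtered without special cases.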
