abs-bvv-3 Total parameters: 1.7B
abs-bvv-3 MMLU [high_school_european_history]: 15.03% ± 1.31% (σ=2.11%)
abs-bvv-3 MMLU [business_ethics]: 17.00% ± 2.37% (σ=3.82%)
abs-bvv-3 MMLU [clinical_knowledge]: 25.77% ± 1.32% (σ=2.13%)
abs-bvv-3 MMLU [medical_genetics]: 25.90% ± 1.67% (σ=2.70%)
abs-bvv-3 MMLU [high_school_us_history]: 15.10% ± 1.21% (σ=1.95%)
abs-bvv-3 MMLU [high_school_physics]: 20.26% ± 1.23% (σ=1.99%)
abs-bvv-3 MMLU [high_school_world_history]: 14.09% ± 0.97% (σ=1.56%)
abs-bvv-3 MMLU [virology]: 21.93% ± 0.79% (σ=1.27%)
abs-bvv-3 MMLU [high_school_microeconomics]: 28.49% ± 1.07% (σ=1.73%)
abs-bvv-3 MMLU [econometrics]: 25.09% ± 1.81% (σ=2.91%)
abs-bvv-3 MMLU [college_computer_science]: 22.70% ± 1.49% (σ=2.41%)
abs-bvv-3 MMLU [high_school_biology]: 24.68% ± 0.92% (σ=1.48%)
abs-bvv-3 MMLU [abstract_algebra]: 9.30% ± 0.92% (σ=1.49%)
abs-bvv-3 MMLU [professional_accounting]: 18.72% ± 1.01% (σ=1.63%)
abs-bvv-3 MMLU [philosophy]: 22.70% ± 0.78% (σ=1.26%)
abs-bvv-3 MMLU [professional_medicine]: 29.34% ± 1.43% (σ=2.30%)
abs-bvv-3 MMLU [nutrition]: 21.93% ± 0.69% (σ=1.11%)
abs-bvv-3 MMLU [global_facts]: 16.00% ± 1.21% (σ=1.95%)
abs-bvv-3 MMLU [machine_learning]: 14.91% ± 1.02% (σ=1.65%)
abs-bvv-3 MMLU [security_studies]: 18.82% ± 1.20% (σ=1.94%)
abs-bvv-3 MMLU [public_relations]: 22.73% ± 1.15% (σ=1.86%)
abs-bvv-3 MMLU [professional_psychology]: 19.20% ± 0.50% (σ=0.80%)
abs-bvv-3 MMLU [prehistory]: 20.15% ± 1.14% (σ=1.83%)
abs-bvv-3 MMLU [anatomy]: 22.30% ± 1.76% (σ=2.84%)
abs-bvv-3 MMLU [human_sexuality]: 26.64% ± 1.05% (σ=1.69%)
abs-bvv-3 MMLU [college_medicine]: 26.13% ± 0.96% (σ=1.55%)
abs-bvv-3 MMLU [high_school_government_and_politics]: 22.07% ± 1.03% (σ=1.66%)
abs-bvv-3 MMLU [college_chemistry]: 26.40% ± 2.59% (σ=4.18%)
abs-bvv-3 MMLU [logical_fallacies]: 16.93% ± 1.83% (σ=2.94%)
abs-bvv-3 MMLU [high_school_geography]: 27.98% ± 1.56% (σ=2.52%)
abs-bvv-3 MMLU [elementary_mathematics]: 20.71% ± 0.45% (σ=0.73%)
abs-bvv-3 MMLU [human_aging]: 18.88% ± 1.57% (σ=2.53%)
abs-bvv-3 MMLU [college_mathematics]: 21.30% ± 1.57% (σ=2.53%)
abs-bvv-3 MMLU [high_school_psychology]: 26.20% ± 0.93% (σ=1.50%)
abs-bvv-3 MMLU [formal_logic]: 21.11% ± 2.07% (σ=3.33%)
abs-bvv-3 MMLU [high_school_statistics]: 22.22% ± 1.08% (σ=1.74%)
abs-bvv-3 MMLU [international_law]: 8.18% ± 1.44% (σ=2.32%)
abs-bvv-3 MMLU [high_school_mathematics]: 21.15% ± 1.38% (σ=2.22%)
abs-bvv-3 MMLU [high_school_computer_science]: 16.80% ± 1.54% (σ=2.48%)
abs-bvv-3 MMLU [conceptual_physics]: 26.85% ± 0.87% (σ=1.40%)
abs-bvv-3 MMLU [miscellaneous]: 19.57% ± 0.68% (σ=1.09%)
abs-bvv-3 MMLU [high_school_chemistry]: 23.15% ± 1.47% (σ=2.37%)
abs-bvv-3 MMLU [marketing]: 21.07% ± 1.11% (σ=1.79%)
abs-bvv-3 MMLU [professional_law]: 12.84% ± 0.39% (σ=0.64%)
abs-bvv-3 MMLU [management]: 28.64% ± 1.35% (σ=2.18%)
abs-bvv-3 MMLU [college_physics]: 21.76% ± 2.65% (σ=4.27%)
abs-bvv-3 MMLU [jurisprudence]: 23.24% ± 2.27% (σ=3.67%)
abs-bvv-3 MMLU [world_religions]: 16.02% ± 0.88% (σ=1.41%)
abs-bvv-3 MMLU [sociology]: 21.84% ± 1.03% (σ=1.66%)
abs-bvv-3 MMLU [us_foreign_policy]: 19.90% ± 1.40% (σ=2.26%)
abs-bvv-3 MMLU [high_school_macroeconomics]: 27.95% ± 1.09% (σ=1.76%)
abs-bvv-3 MMLU [computer_security]: 17.00% ± 1.73% (σ=2.79%)
abs-bvv-3 MMLU [moral_scenarios]: 22.93% ± 0.70% (σ=1.13%)
abs-bvv-3 MMLU [moral_disputes]: 21.13% ± 0.71% (σ=1.14%)
abs-bvv-3 MMLU [electrical_engineering]: 21.24% ± 1.61% (σ=2.60%)
abs-bvv-3 MMLU [astronomy]: 20.59% ± 1.34% (σ=2.16%)
abs-bvv-3 MMLU [college_biology]: 20.07% ± 1.81% (σ=2.92%)
abs-bvv-3 MMLU: 20.63% ± 0.14% (σ=0.23%)
abs-bvv-3 ARC-e: 21.81% ± 0.58% (σ=0.94%)
abs-bvv-3 ARC-c: 24.78% ± 1.43% (σ=2.31%)
abs-bvv-3 C-SENSE: 19.50% ± 0.75% (σ=1.22%)
abs-bvv-3 SQUAD: 3.75% ± 0.75% (σ=1.21%)
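
The result lines above share one layout: model name, benchmark, an optional bracketed subject, a mean percentage, a ± margin, and σ. A minimal sketch of a parser for that layout follows. The function name `parse_line` and the dictionary keys are hypothetical and are not taken from the tool that produced this log. The log does not say how the ± margin is derived; in the reported lines it is roughly 0.62 × σ, which would match a 95% normal interval over 10 runs (1.96/√10 ≈ 0.62), but that reading is an assumption.

```python
import re

# Matches lines such as:
#   abs-bvv-3 MMLU [virology]: 21.93% ± 0.79% (σ=1.27%)
#   abs-bvv-3 ARC-e: 21.81% ± 0.58% (σ=0.94%)
LINE_RE = re.compile(
    r"^(?P<model>\S+)\s+(?P<bench>[\w-]+)(?:\s+\[(?P<subject>\w+)\])?:\s+"
    r"(?P<mean>\d+(?:\.\d+)?)%\s+±\s+(?P<margin>\d+(?:\.\d+)?)%\s+"
    r"\(σ=(?P<sigma>\d+(?:\.\d+)?)%\)\s*$"
)


def parse_line(line):
    """Parse one result line into a dict, or return None if it is not a result line.

    Hypothetical helper: the key names are chosen for illustration only.
    """
    m = LINE_RE.match(line.strip())
    if m is None:
        return None  # e.g. the "Total parameters" header line
    return {
        "model": m.group("model"),
        "benchmark": m.group("bench"),
        "subject": m.group("subject"),  # None for aggregate lines
        "mean": float(m.group("mean")),
        "margin": float(m.group("margin")),
        "sigma": float(m.group("sigma")),
    }


if __name__ == "__main__":
    sample = "abs-bvv-3 MMLU [virology]: 21.93% ± 0.79% (σ=1.27%)"
    row = parse_line(sample)
    print(row)
    # Ratio of the reported margin to sigma for this line.
    print(round(row["margin"] / row["sigma"], 3))
```

Lines that do not follow the layout, such as the parameter-count header, return `None`, so they can be filtered out before any aggregation.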
