abs-bvv-1 Total parameters:     1.3B
abs-bvv-1 MMLU [high_school_european_history]: 14.85% ± 1.52% (σ=2.46%)
abs-bvv-1 MMLU [business_ethics]: 15.00% ± 1.71% (σ=2.76%)
abs-bvv-1 MMLU [clinical_knowledge]: 24.34% ± 1.85% (σ=2.98%)
abs-bvv-1 MMLU [medical_genetics]: 23.10% ± 2.29% (σ=3.70%)
abs-bvv-1 MMLU [high_school_us_history]: 15.69% ± 0.99% (σ=1.60%)
abs-bvv-1 MMLU [high_school_physics]: 20.79% ± 0.76% (σ=1.23%)
abs-bvv-1 MMLU [high_school_world_history]: 13.04% ± 0.79% (σ=1.27%)
abs-bvv-1 MMLU [virology]: 17.41% ± 1.53% (σ=2.46%)
abs-bvv-1 MMLU [high_school_microeconomics]: 24.20% ± 1.02% (σ=1.65%)
abs-bvv-1 MMLU [econometrics]: 20.70% ± 2.95% (σ=4.76%)
abs-bvv-1 MMLU [college_computer_science]: 21.10% ± 1.70% (σ=2.74%)
abs-bvv-1 MMLU [high_school_biology]: 20.13% ± 1.24% (σ=1.99%)
abs-bvv-1 MMLU [abstract_algebra]: 16.10% ± 1.65% (σ=2.66%)
abs-bvv-1 MMLU [professional_accounting]: 15.43% ± 0.88% (σ=1.42%)
abs-bvv-1 MMLU [philosophy]: 20.84% ± 1.16% (σ=1.87%)
abs-bvv-1 MMLU [professional_medicine]: 32.90% ± 0.91% (σ=1.47%)
abs-bvv-1 MMLU [nutrition]: 21.44% ± 1.66% (σ=2.68%)
abs-bvv-1 MMLU [global_facts]: 15.50% ± 1.71% (σ=2.77%)
abs-bvv-1 MMLU [machine_learning]: 15.45% ± 1.90% (σ=3.07%)
abs-bvv-1 MMLU [security_studies]: 16.04% ± 1.06% (σ=1.71%)
abs-bvv-1 MMLU [public_relations]: 18.45% ± 1.90% (σ=3.07%)
abs-bvv-1 MMLU [professional_psychology]: 16.24% ± 0.76% (σ=1.23%)
abs-bvv-1 MMLU [prehistory]: 18.92% ± 1.00% (σ=1.61%)
abs-bvv-1 MMLU [anatomy]: 20.44% ± 0.90% (σ=1.45%)
abs-bvv-1 MMLU [human_sexuality]: 24.27% ± 1.66% (σ=2.68%)
abs-bvv-1 MMLU [college_medicine]: 21.85% ± 1.42% (σ=2.29%)
abs-bvv-1 MMLU [high_school_government_and_politics]: 20.88% ± 1.00% (σ=1.61%)
abs-bvv-1 MMLU [college_chemistry]: 24.10% ± 1.70% (σ=2.74%)
abs-bvv-1 MMLU [logical_fallacies]: 16.13% ± 1.63% (σ=2.63%)
abs-bvv-1 MMLU [high_school_geography]: 26.01% ± 1.32% (σ=2.12%)
abs-bvv-1 MMLU [elementary_mathematics]: 18.60% ± 0.82% (σ=1.32%)
abs-bvv-1 MMLU [human_aging]: 14.66% ± 0.79% (σ=1.27%)
abs-bvv-1 MMLU [college_mathematics]: 21.50% ± 2.29% (σ=3.69%)
abs-bvv-1 MMLU [high_school_psychology]: 23.98% ± 0.53% (σ=0.86%)
abs-bvv-1 MMLU [formal_logic]: 20.24% ± 1.93% (σ=3.12%)
abs-bvv-1 MMLU [high_school_statistics]: 19.63% ± 1.11% (σ=1.80%)
abs-bvv-1 MMLU [international_law]: 6.94% ± 1.10% (σ=1.78%)
abs-bvv-1 MMLU [high_school_mathematics]: 16.89% ± 1.39% (σ=2.24%)
abs-bvv-1 MMLU [high_school_computer_science]: 11.00% ± 1.47% (σ=2.37%)
abs-bvv-1 MMLU [conceptual_physics]: 20.09% ± 1.29% (σ=2.07%)
abs-bvv-1 MMLU [miscellaneous]: 16.70% ± 0.72% (σ=1.16%)
abs-bvv-1 MMLU [high_school_chemistry]: 18.77% ± 1.92% (σ=3.10%)
abs-bvv-1 MMLU [marketing]: 17.52% ± 1.25% (σ=2.02%)
abs-bvv-1 MMLU [professional_law]: 9.99% ± 0.43% (σ=0.69%)
abs-bvv-1 MMLU [management]: 24.17% ± 2.57% (σ=4.15%)
abs-bvv-1 MMLU [college_physics]: 21.37% ± 1.65% (σ=2.66%)
abs-bvv-1 MMLU [jurisprudence]: 17.41% ± 1.33% (σ=2.14%)
abs-bvv-1 MMLU [world_religions]: 15.96% ± 1.12% (σ=1.81%)
abs-bvv-1 MMLU [sociology]: 18.91% ± 1.85% (σ=2.99%)
abs-bvv-1 MMLU [us_foreign_policy]: 18.80% ± 2.03% (σ=3.28%)
abs-bvv-1 MMLU [high_school_macroeconomics]: 24.79% ± 0.76% (σ=1.22%)
abs-bvv-1 MMLU [computer_security]: 13.20% ± 1.61% (σ=2.60%)
abs-bvv-1 MMLU [moral_scenarios]: 17.33% ± 0.51% (σ=0.82%)
abs-bvv-1 MMLU [moral_disputes]: 17.83% ± 1.10% (σ=1.78%)
abs-bvv-1 MMLU [electrical_engineering]: 18.90% ± 1.75% (σ=2.83%)
abs-bvv-1 MMLU [astronomy]: 19.74% ± 0.71% (σ=1.14%)
abs-bvv-1 MMLU [college_biology]: 16.46% ± 1.45% (σ=2.35%)
abs-bvv-1 MMLU: 18.08% ± 0.15% (σ=0.24%)
abs-bvv-1 ARC-e: 19.21% ± 0.48% (σ=0.77%)
abs-bvv-1 ARC-c: 19.83% ± 1.15% (σ=1.85%)
abs-bvv-1 C-SENSE: 19.36% ± 0.55% (σ=0.89%)
abs-bvv-1 SQUAD: 1.21% ± 0.31% (σ=0.51%)
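
To work with these numbers programmatically, the lines above can be parsed with a small regular expression. This is a minimal sketch, not part of the original evaluation code: the names `LINE_RE`, `parse_line`, and `margin_to_std_ratio` are hypothetical, and the pattern assumes every result line follows the shape `<model> <BENCH> [<subset>]: <mean>% ± <margin>% (σ=<std>%)`. Lines that do not fit, such as the parameter-count line, are skipped.

```python
import math
import re

# Assumed line shape (inferred from the log, not from any documented format):
#   <model> <BENCH> [<subset>]: <mean>% ± <margin>% (σ=<std>%)
LINE_RE = re.compile(
    r"^(?P<model>\S+)\s+(?P<bench>[\w-]+)"
    r"(?:\s+\[(?P<subset>\w+)\])?:\s+"
    r"(?P<mean>\d+(?:\.\d+)?)%\s+±\s+(?P<margin>\d+(?:\.\d+)?)%\s+"
    r"\(σ=(?P<std>\d+(?:\.\d+)?)%\)$"
)


def parse_line(line):
    """Return a dict for one result line, or None if the line has another shape."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    for key in ("mean", "margin", "std"):
        record[key] = float(record[key])
    return record


def margin_to_std_ratio(record):
    """Ratio of the reported ± margin to the reported σ for one record."""
    return record["margin"] / record["std"]


if __name__ == "__main__":
    sample = "abs-bvv-1 MMLU [business_ethics]: 15.00% ± 1.71% (σ=2.76%)"
    rec = parse_line(sample)
    print(rec["bench"], rec["subset"], rec["mean"])
    # Compare against 1.96 / sqrt(10); see the note below for why this is only a guess.
    print(round(margin_to_std_ratio(rec), 3), round(1.96 / math.sqrt(10), 3))
```

Across the log the ratio of the ± margin to σ stays near 0.62, which is consistent with a 95% normal interval over 10 samples (1.96·σ/√10). The log itself does not state how the margin was computed, so treat that reading as an inference from the numbers rather than a documented fact.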
