max_bvv_moe Total parameters:     0.8B
max_bvv_moe MMLU [high_school_european_history]: 14.24% ± 1.21% (σ=1.96%)
max_bvv_moe MMLU [business_ethics]: 20.40% ± 2.66% (σ=4.29%)
max_bvv_moe MMLU [clinical_knowledge]: 25.40% ± 1.49% (σ=2.41%)
max_bvv_moe MMLU [medical_genetics]: 23.10% ± 2.12% (σ=3.42%)
max_bvv_moe MMLU [high_school_us_history]: 16.03% ± 1.57% (σ=2.53%)
max_bvv_moe MMLU [high_school_physics]: 24.11% ± 1.71% (σ=2.77%)
max_bvv_moe MMLU [high_school_world_history]: 14.68% ± 0.51% (σ=0.82%)
max_bvv_moe MMLU [virology]: 24.76% ± 1.91% (σ=3.09%)
max_bvv_moe MMLU [high_school_microeconomics]: 29.33% ± 1.47% (σ=2.38%)
max_bvv_moe MMLU [econometrics]: 21.14% ± 1.66% (σ=2.67%)
max_bvv_moe MMLU [college_computer_science]: 19.30% ± 2.18% (σ=3.52%)
max_bvv_moe MMLU [high_school_biology]: 27.68% ± 1.20% (σ=1.93%)
max_bvv_moe MMLU [abstract_algebra]: 22.20% ± 1.77% (σ=2.86%)
max_bvv_moe MMLU [professional_accounting]: 21.49% ± 1.45% (σ=2.34%)
max_bvv_moe MMLU [philosophy]: 23.60% ± 0.96% (σ=1.55%)
max_bvv_moe MMLU [professional_medicine]: 31.36% ± 1.42% (σ=2.28%)
max_bvv_moe MMLU [nutrition]: 23.89% ± 0.98% (σ=1.58%)
max_bvv_moe MMLU [global_facts]: 22.70% ± 1.14% (σ=1.85%)
max_bvv_moe MMLU [machine_learning]: 16.25% ± 1.85% (σ=2.98%)
max_bvv_moe MMLU [security_studies]: 21.84% ± 1.37% (σ=2.22%)
max_bvv_moe MMLU [public_relations]: 24.55% ± 2.47% (σ=3.98%)
max_bvv_moe MMLU [professional_psychology]: 22.42% ± 0.84% (σ=1.36%)
max_bvv_moe MMLU [prehistory]: 22.19% ± 0.92% (σ=1.48%)
max_bvv_moe MMLU [anatomy]: 22.37% ± 2.17% (σ=3.50%)
max_bvv_moe MMLU [human_sexuality]: 25.19% ± 0.92% (σ=1.49%)
max_bvv_moe MMLU [college_medicine]: 26.13% ± 1.75% (σ=2.82%)
max_bvv_moe MMLU [high_school_government_and_politics]: 22.44% ± 1.33% (σ=2.15%)
max_bvv_moe MMLU [college_chemistry]: 28.50% ± 2.32% (σ=3.75%)
max_bvv_moe MMLU [logical_fallacies]: 20.98% ± 1.32% (σ=2.12%)
max_bvv_moe MMLU [high_school_geography]: 25.10% ± 1.74% (σ=2.81%)
max_bvv_moe MMLU [elementary_mathematics]: 21.80% ± 0.74% (σ=1.20%)
max_bvv_moe MMLU [human_aging]: 21.03% ± 1.36% (σ=2.20%)
max_bvv_moe MMLU [college_mathematics]: 22.60% ± 2.15% (σ=3.47%)
max_bvv_moe MMLU [high_school_psychology]: 26.44% ± 1.06% (σ=1.71%)
max_bvv_moe MMLU [formal_logic]: 21.90% ± 2.28% (σ=3.67%)
max_bvv_moe MMLU [high_school_statistics]: 26.16% ± 1.46% (σ=2.36%)
max_bvv_moe MMLU [international_law]: 14.21% ± 1.07% (σ=1.73%)
max_bvv_moe MMLU [high_school_mathematics]: 22.93% ± 0.77% (σ=1.24%)
max_bvv_moe MMLU [high_school_computer_science]: 18.30% ± 1.71% (σ=2.76%)
max_bvv_moe MMLU [conceptual_physics]: 23.57% ± 1.04% (σ=1.67%)
max_bvv_moe MMLU [miscellaneous]: 21.40% ± 0.74% (σ=1.20%)
max_bvv_moe MMLU [high_school_chemistry]: 25.67% ± 1.89% (σ=3.05%)
max_bvv_moe MMLU [marketing]: 22.35% ± 1.32% (σ=2.13%)
max_bvv_moe MMLU [professional_law]: 19.60% ± 0.29% (σ=0.46%)
max_bvv_moe MMLU [management]: 25.05% ± 1.49% (σ=2.41%)
max_bvv_moe MMLU [college_physics]: 25.78% ± 0.94% (σ=1.52%)
max_bvv_moe MMLU [jurisprudence]: 22.41% ± 1.23% (σ=1.98%)
max_bvv_moe MMLU [world_religions]: 17.31% ± 1.29% (σ=2.08%)
max_bvv_moe MMLU [sociology]: 21.64% ± 1.19% (σ=1.92%)
max_bvv_moe MMLU [us_foreign_policy]: 18.70% ± 2.89% (σ=4.67%)
max_bvv_moe MMLU [high_school_macroeconomics]: 27.82% ± 0.76% (σ=1.22%)
max_bvv_moe MMLU [computer_security]: 17.80% ± 1.64% (σ=2.64%)
max_bvv_moe MMLU [moral_scenarios]: 21.71% ± 0.63% (σ=1.02%)
max_bvv_moe MMLU [moral_disputes]: 20.87% ± 0.97% (σ=1.57%)
max_bvv_moe MMLU [electrical_engineering]: 22.83% ± 1.33% (σ=2.15%)
max_bvv_moe MMLU [astronomy]: 25.53% ± 1.19% (σ=1.92%)
max_bvv_moe MMLU [college_biology]: 23.40% ± 2.17% (σ=3.50%)
max_bvv_moe MMLU: 22.37% ± 0.17% (σ=0.27%)
max_bvv_moe ARC-e: 21.39% ± 0.68% (σ=1.10%)
max_bvv_moe ARC-c: 25.05% ± 0.87% (σ=1.41%)
max_bvv_moe C-SENSE: 20.12% ± 0.69% (σ=1.11%)
max_bvv_moe SQUAD: 18.40% ± 0.98% (σ=1.59%)
max_bvv_moe BLEU [en-ru]: 5.02% ± 0.46% (σ=0.75%)
max_bvv_moe BLEU [ru-en]: 5.04% ± 0.27% (σ=0.44%)
max_bvv_moe BLEU [en-zh]: 1.34% ± 0.11% (σ=0.18%)
max_bvv_moe BLEU [zh-en]: 3.35% ± 0.21% (σ=0.34%)
