pro_bvv_unfrozen Total parameters: 0.2B
pro_bvv_unfrozen MMLU [high_school_european_history]: 7.82% ± 0.80% (σ=1.28%)
pro_bvv_unfrozen MMLU [business_ethics]: 15.30% ± 1.92% (σ=3.10%)
pro_bvv_unfrozen MMLU [clinical_knowledge]: 17.77% ± 0.72% (σ=1.16%)
pro_bvv_unfrozen MMLU [medical_genetics]: 18.10% ± 1.87% (σ=3.01%)
pro_bvv_unfrozen MMLU [high_school_us_history]: 7.35% ± 0.76% (σ=1.22%)
pro_bvv_unfrozen MMLU [high_school_physics]: 13.51% ± 1.01% (σ=1.63%)
pro_bvv_unfrozen MMLU [high_school_world_history]: 9.87% ± 0.99% (σ=1.60%)
pro_bvv_unfrozen MMLU [virology]: 16.39% ± 1.22% (σ=1.98%)
pro_bvv_unfrozen MMLU [high_school_microeconomics]: 17.14% ± 1.33% (σ=2.14%)
pro_bvv_unfrozen MMLU [econometrics]: 16.23% ± 2.21% (σ=3.56%)
pro_bvv_unfrozen MMLU [college_computer_science]: 13.20% ± 3.10% (σ=5.00%)
pro_bvv_unfrozen MMLU [high_school_biology]: 16.29% ± 1.78% (σ=2.86%)
pro_bvv_unfrozen MMLU [abstract_algebra]: 12.30% ± 2.87% (σ=4.63%)
pro_bvv_unfrozen MMLU [professional_accounting]: 14.47% ± 1.31% (σ=2.12%)
pro_bvv_unfrozen MMLU [philosophy]: 17.11% ± 1.02% (σ=1.64%)
pro_bvv_unfrozen MMLU [professional_medicine]: 11.47% ± 0.88% (σ=1.42%)
pro_bvv_unfrozen MMLU [nutrition]: 15.72% ± 1.17% (σ=1.89%)
pro_bvv_unfrozen MMLU [global_facts]: 9.60% ± 1.82% (σ=2.94%)
pro_bvv_unfrozen MMLU [machine_learning]: 16.96% ± 1.98% (σ=3.19%)
pro_bvv_unfrozen MMLU [security_studies]: 11.31% ± 1.17% (σ=1.89%)
pro_bvv_unfrozen MMLU [public_relations]: 15.27% ± 1.76% (σ=2.84%)
pro_bvv_unfrozen MMLU [professional_psychology]: 15.64% ± 1.08% (σ=1.75%)
pro_bvv_unfrozen MMLU [prehistory]: 15.31% ± 0.97% (σ=1.57%)
pro_bvv_unfrozen MMLU [anatomy]: 15.56% ± 2.07% (σ=3.35%)
pro_bvv_unfrozen MMLU [human_sexuality]: 15.42% ± 2.18% (σ=3.51%)
pro_bvv_unfrozen MMLU [college_medicine]: 15.38% ± 1.19% (σ=1.92%)
pro_bvv_unfrozen MMLU [high_school_government_and_politics]: 12.90% ± 0.99% (σ=1.60%)
pro_bvv_unfrozen MMLU [college_chemistry]: 11.90% ± 1.76% (σ=2.84%)
pro_bvv_unfrozen MMLU [logical_fallacies]: 16.26% ± 1.60% (σ=2.58%)
pro_bvv_unfrozen MMLU [high_school_geography]: 14.95% ± 1.61% (σ=2.60%)
pro_bvv_unfrozen MMLU [elementary_mathematics]: 11.72% ± 1.04% (σ=1.68%)
pro_bvv_unfrozen MMLU [human_aging]: 17.40% ± 1.05% (σ=1.70%)
pro_bvv_unfrozen MMLU [college_mathematics]: 12.00% ± 1.94% (σ=3.13%)
pro_bvv_unfrozen MMLU [high_school_psychology]: 15.78% ± 0.84% (σ=1.36%)
pro_bvv_unfrozen MMLU [formal_logic]: 16.03% ± 2.07% (σ=3.34%)
pro_bvv_unfrozen MMLU [high_school_statistics]: 11.30% ± 1.50% (σ=2.43%)
pro_bvv_unfrozen MMLU [international_law]: 11.32% ± 1.72% (σ=2.77%)
pro_bvv_unfrozen MMLU [high_school_mathematics]: 10.22% ± 1.06% (σ=1.71%)
pro_bvv_unfrozen MMLU [high_school_computer_science]: 13.70% ± 2.70% (σ=4.36%)
pro_bvv_unfrozen MMLU [conceptual_physics]: 20.85% ± 1.55% (σ=2.50%)
pro_bvv_unfrozen MMLU [miscellaneous]: 13.81% ± 0.70% (σ=1.13%)
pro_bvv_unfrozen MMLU [high_school_chemistry]: 13.00% ± 1.02% (σ=1.65%)
pro_bvv_unfrozen MMLU [marketing]: 19.74% ± 1.66% (σ=2.67%)
pro_bvv_unfrozen MMLU [professional_law]: 9.99% ± 0.50% (σ=0.80%)
pro_bvv_unfrozen MMLU [management]: 13.59% ± 2.36% (σ=3.81%)
pro_bvv_unfrozen MMLU [college_physics]: 14.41% ± 1.05% (σ=1.70%)
pro_bvv_unfrozen MMLU [jurisprudence]: 16.20% ± 1.36% (σ=2.20%)
pro_bvv_unfrozen MMLU [world_religions]: 13.16% ± 1.41% (σ=2.27%)
pro_bvv_unfrozen MMLU [sociology]: 13.38% ± 2.27% (σ=3.67%)
pro_bvv_unfrozen MMLU [us_foreign_policy]: 14.40% ± 1.57% (σ=2.54%)
pro_bvv_unfrozen MMLU [high_school_macroeconomics]: 16.00% ± 1.09% (σ=1.75%)
pro_bvv_unfrozen MMLU [computer_security]: 15.40% ± 2.37% (σ=3.83%)
pro_bvv_unfrozen MMLU [moral_scenarios]: 10.07% ± 0.64% (σ=1.03%)
pro_bvv_unfrozen MMLU [moral_disputes]: 18.50% ± 0.90% (σ=1.45%)
pro_bvv_unfrozen MMLU [electrical_engineering]: 20.14% ± 2.04% (σ=3.29%)
pro_bvv_unfrozen MMLU [astronomy]: 13.36% ± 1.77% (σ=2.85%)
pro_bvv_unfrozen MMLU [college_biology]: 15.69% ± 2.29% (σ=3.69%)
pro_bvv_unfrozen MMLU: 14.00% ± 0.14% (σ=0.22%)
pro_bvv_unfrozen ARC-e: 24.09% ± 0.78% (σ=1.26%)
pro_bvv_unfrozen ARC-c: 22.24% ± 1.04% (σ=1.67%)
pro_bvv_unfrozen C-SENSE: 19.76% ± 0.52% (σ=0.84%)
pro_bvv_unfrozen SQUAD: 13.28% ± 0.93% (σ=1.49%)
pro_bvv_unfrozen BLEU [en-ru]: 0.68% ± 0.11% (σ=0.17%)
pro_bvv_unfrozen BLEU [ru-en]: 0.81% ± 0.06% (σ=0.10%)
pro_bvv_unfrozen BLEU [en-zh]: 0.16% ± 0.07% (σ=0.11%)
pro_bvv_unfrozen BLEU [zh-en]: 0.42% ± 0.07% (σ=0.11%)
