best_bvv_unfrozen_zh Total parameters: 0.5B
best_bvv_unfrozen_zh MMLU [high_school_european_history]: 2.91% ± 1.27% (σ=2.04%)
best_bvv_unfrozen_zh MMLU [business_ethics]: 14.80% ± 1.96% (σ=3.16%)
best_bvv_unfrozen_zh MMLU [clinical_knowledge]: 20.60% ± 1.45% (σ=2.33%)
best_bvv_unfrozen_zh MMLU [medical_genetics]: 17.90% ± 1.60% (σ=2.59%)
best_bvv_unfrozen_zh MMLU [high_school_us_history]: 4.46% ± 0.95% (σ=1.53%)
best_bvv_unfrozen_zh MMLU [high_school_physics]: 14.24% ± 1.22% (σ=1.97%)
best_bvv_unfrozen_zh MMLU [high_school_world_history]: 5.57% ± 0.58% (σ=0.94%)
best_bvv_unfrozen_zh MMLU [virology]: 16.39% ± 1.63% (σ=2.64%)
best_bvv_unfrozen_zh MMLU [high_school_microeconomics]: 18.99% ± 1.36% (σ=2.19%)
best_bvv_unfrozen_zh MMLU [econometrics]: 12.19% ± 0.82% (σ=1.33%)
best_bvv_unfrozen_zh MMLU [college_computer_science]: 11.50% ± 1.60% (σ=2.58%)
best_bvv_unfrozen_zh MMLU [high_school_biology]: 19.97% ± 0.87% (σ=1.40%)
best_bvv_unfrozen_zh MMLU [abstract_algebra]: 19.00% ± 1.62% (σ=2.61%)
best_bvv_unfrozen_zh MMLU [professional_accounting]: 14.54% ± 0.78% (σ=1.26%)
best_bvv_unfrozen_zh MMLU [philosophy]: 19.23% ± 0.96% (σ=1.55%)
best_bvv_unfrozen_zh MMLU [professional_medicine]: 16.14% ± 1.69% (σ=2.73%)
best_bvv_unfrozen_zh MMLU [nutrition]: 17.94% ± 0.58% (σ=0.93%)
best_bvv_unfrozen_zh MMLU [global_facts]: 7.00% ± 1.04% (σ=1.67%)
best_bvv_unfrozen_zh MMLU [machine_learning]: 12.23% ± 0.93% (σ=1.50%)
best_bvv_unfrozen_zh MMLU [security_studies]: 13.88% ± 0.96% (σ=1.55%)
best_bvv_unfrozen_zh MMLU [public_relations]: 16.73% ± 1.77% (σ=2.85%)
best_bvv_unfrozen_zh MMLU [professional_psychology]: 16.19% ± 0.74% (σ=1.19%)
best_bvv_unfrozen_zh MMLU [prehistory]: 16.76% ± 0.62% (σ=1.01%)
best_bvv_unfrozen_zh MMLU [anatomy]: 18.07% ± 1.85% (σ=2.99%)
best_bvv_unfrozen_zh MMLU [human_sexuality]: 17.48% ± 1.47% (σ=2.38%)
best_bvv_unfrozen_zh MMLU [college_medicine]: 18.38% ± 1.07% (σ=1.73%)
best_bvv_unfrozen_zh MMLU [high_school_government_and_politics]: 12.28% ± 0.93% (σ=1.50%)
best_bvv_unfrozen_zh MMLU [college_chemistry]: 16.60% ± 2.62% (σ=4.22%)
best_bvv_unfrozen_zh MMLU [logical_fallacies]: 12.39% ± 0.72% (σ=1.16%)
best_bvv_unfrozen_zh MMLU [high_school_geography]: 17.37% ± 1.58% (σ=2.55%)
best_bvv_unfrozen_zh MMLU [elementary_mathematics]: 13.02% ± 1.06% (σ=1.71%)
best_bvv_unfrozen_zh MMLU [human_aging]: 18.52% ± 1.30% (σ=2.10%)
best_bvv_unfrozen_zh MMLU [college_mathematics]: 17.40% ± 1.39% (σ=2.24%)
best_bvv_unfrozen_zh MMLU [high_school_psychology]: 19.21% ± 0.99% (σ=1.59%)
best_bvv_unfrozen_zh MMLU [formal_logic]: 13.73% ± 1.23% (σ=1.98%)
best_bvv_unfrozen_zh MMLU [high_school_statistics]: 14.07% ± 1.56% (σ=2.51%)
best_bvv_unfrozen_zh MMLU [international_law]: 5.95% ± 1.25% (σ=2.02%)
best_bvv_unfrozen_zh MMLU [high_school_mathematics]: 13.52% ± 1.14% (σ=1.85%)
best_bvv_unfrozen_zh MMLU [high_school_computer_science]: 11.10% ± 1.16% (σ=1.87%)
best_bvv_unfrozen_zh MMLU [conceptual_physics]: 22.85% ± 1.31% (σ=2.12%)
best_bvv_unfrozen_zh MMLU [miscellaneous]: 11.34% ± 0.70% (σ=1.13%)
best_bvv_unfrozen_zh MMLU [high_school_chemistry]: 16.80% ± 1.42% (σ=2.28%)
best_bvv_unfrozen_zh MMLU [marketing]: 18.80% ± 0.72% (σ=1.16%)
best_bvv_unfrozen_zh MMLU [professional_law]: 9.41% ± 0.29% (σ=0.46%)
best_bvv_unfrozen_zh MMLU [management]: 12.04% ± 2.11% (σ=3.40%)
best_bvv_unfrozen_zh MMLU [college_physics]: 11.57% ± 1.90% (σ=3.06%)
best_bvv_unfrozen_zh MMLU [jurisprudence]: 16.30% ± 1.63% (σ=2.63%)
best_bvv_unfrozen_zh MMLU [world_religions]: 7.72% ± 0.56% (σ=0.90%)
best_bvv_unfrozen_zh MMLU [sociology]: 11.14% ± 1.21% (σ=1.95%)
best_bvv_unfrozen_zh MMLU [us_foreign_policy]: 8.90% ± 1.50% (σ=2.43%)
best_bvv_unfrozen_zh MMLU [high_school_macroeconomics]: 19.90% ± 0.83% (σ=1.34%)
best_bvv_unfrozen_zh MMLU [computer_security]: 11.60% ± 1.71% (σ=2.76%)
best_bvv_unfrozen_zh MMLU [moral_scenarios]: 8.79% ± 0.49% (σ=0.80%)
best_bvv_unfrozen_zh MMLU [moral_disputes]: 16.27% ± 0.72% (σ=1.16%)
best_bvv_unfrozen_zh MMLU [electrical_engineering]: 19.31% ± 1.31% (σ=2.11%)
best_bvv_unfrozen_zh MMLU [astronomy]: 15.20% ± 1.38% (σ=2.23%)
best_bvv_unfrozen_zh MMLU [college_biology]: 17.50% ± 1.87% (σ=3.02%)
best_bvv_unfrozen_zh MMLU: 14.03% ± 0.09% (σ=0.14%)
best_bvv_unfrozen_zh ARC-e: 19.74% ± 0.70% (σ=1.13%)
best_bvv_unfrozen_zh ARC-c: 25.02% ± 0.97% (σ=1.57%)
best_bvv_unfrozen_zh C-SENSE: 18.98% ± 0.56% (σ=0.90%)
best_bvv_unfrozen_zh SQUAD: 13.52% ± 0.75% (σ=1.21%)
best_bvv_unfrozen_zh BLEU [en-ru]: 4.29% ± 0.22% (σ=0.35%)
best_bvv_unfrozen_zh BLEU [ru-en]: 2.90% ± 0.27% (σ=0.44%)
best_bvv_unfrozen_zh BLEU [en-zh]: 1.65% ± 0.32% (σ=0.52%)
best_bvv_unfrozen_zh BLEU [zh-en]: 5.93% ± 0.32% (σ=0.51%)
