best_bvv_unfrozen_ru Total parameters:     0.5B
best_bvv_unfrozen_ru MMLU [high_school_european_history]: 7.52% ± 1.41% (σ=2.27%)
best_bvv_unfrozen_ru MMLU [business_ethics]: 14.90% ± 1.65% (σ=2.66%)
best_bvv_unfrozen_ru MMLU [clinical_knowledge]: 14.79% ± 1.05% (σ=1.69%)
best_bvv_unfrozen_ru MMLU [medical_genetics]: 12.10% ± 2.03% (σ=3.27%)
best_bvv_unfrozen_ru MMLU [high_school_us_history]: 7.40% ± 0.77% (σ=1.25%)
best_bvv_unfrozen_ru MMLU [high_school_physics]: 12.45% ± 1.16% (σ=1.87%)
best_bvv_unfrozen_ru MMLU [high_school_world_history]: 9.49% ± 1.11% (σ=1.79%)
best_bvv_unfrozen_ru MMLU [virology]: 13.31% ± 1.83% (σ=2.96%)
best_bvv_unfrozen_ru MMLU [high_school_microeconomics]: 15.13% ± 1.11% (σ=1.79%)
best_bvv_unfrozen_ru MMLU [econometrics]: 12.81% ± 0.88% (σ=1.43%)
best_bvv_unfrozen_ru MMLU [college_computer_science]: 14.80% ± 1.29% (σ=2.09%)
best_bvv_unfrozen_ru MMLU [high_school_biology]: 13.00% ± 0.89% (σ=1.43%)
best_bvv_unfrozen_ru MMLU [abstract_algebra]: 13.10% ± 0.98% (σ=1.58%)
best_bvv_unfrozen_ru MMLU [professional_accounting]: 14.43% ± 1.19% (σ=1.92%)
best_bvv_unfrozen_ru MMLU [philosophy]: 12.06% ± 0.86% (σ=1.39%)
best_bvv_unfrozen_ru MMLU [professional_medicine]: 11.10% ± 0.94% (σ=1.52%)
best_bvv_unfrozen_ru MMLU [nutrition]: 13.63% ± 1.17% (σ=1.88%)
best_bvv_unfrozen_ru MMLU [global_facts]: 5.10% ± 0.94% (σ=1.51%)
best_bvv_unfrozen_ru MMLU [machine_learning]: 13.84% ± 1.91% (σ=3.07%)
best_bvv_unfrozen_ru MMLU [security_studies]: 7.10% ± 0.67% (σ=1.08%)
best_bvv_unfrozen_ru MMLU [public_relations]: 18.27% ± 1.39% (σ=2.24%)
best_bvv_unfrozen_ru MMLU [professional_psychology]: 12.86% ± 0.59% (σ=0.95%)
best_bvv_unfrozen_ru MMLU [prehistory]: 12.16% ± 1.06% (σ=1.71%)
best_bvv_unfrozen_ru MMLU [anatomy]: 12.67% ± 1.52% (σ=2.44%)
best_bvv_unfrozen_ru MMLU [human_sexuality]: 11.53% ± 1.44% (σ=2.33%)
best_bvv_unfrozen_ru MMLU [college_medicine]: 13.06% ± 2.29% (σ=3.69%)
best_bvv_unfrozen_ru MMLU [high_school_government_and_politics]: 14.09% ± 1.06% (σ=1.72%)
best_bvv_unfrozen_ru MMLU [college_chemistry]: 13.50% ± 1.22% (σ=1.96%)
best_bvv_unfrozen_ru MMLU [logical_fallacies]: 12.15% ± 1.15% (σ=1.86%)
best_bvv_unfrozen_ru MMLU [high_school_geography]: 14.39% ± 1.42% (σ=2.28%)
best_bvv_unfrozen_ru MMLU [elementary_mathematics]: 13.23% ± 0.81% (σ=1.31%)
best_bvv_unfrozen_ru MMLU [human_aging]: 14.04% ± 1.19% (σ=1.91%)
best_bvv_unfrozen_ru MMLU [college_mathematics]: 11.80% ± 1.10% (σ=1.78%)
best_bvv_unfrozen_ru MMLU [high_school_psychology]: 14.53% ± 0.63% (σ=1.02%)
best_bvv_unfrozen_ru MMLU [formal_logic]: 10.63% ± 1.48% (σ=2.39%)
best_bvv_unfrozen_ru MMLU [high_school_statistics]: 11.44% ± 1.33% (σ=2.15%)
best_bvv_unfrozen_ru MMLU [international_law]: 8.60% ± 1.24% (σ=2.00%)
best_bvv_unfrozen_ru MMLU [high_school_mathematics]: 11.22% ± 0.85% (σ=1.37%)
best_bvv_unfrozen_ru MMLU [high_school_computer_science]: 10.30% ± 1.44% (σ=2.33%)
best_bvv_unfrozen_ru MMLU [conceptual_physics]: 16.68% ± 1.35% (σ=2.18%)
best_bvv_unfrozen_ru MMLU [miscellaneous]: 8.57% ± 0.60% (σ=0.97%)
best_bvv_unfrozen_ru MMLU [high_school_chemistry]: 11.38% ± 1.08% (σ=1.74%)
best_bvv_unfrozen_ru MMLU [marketing]: 17.99% ± 0.75% (σ=1.22%)
best_bvv_unfrozen_ru MMLU [professional_law]: 6.08% ± 0.24% (σ=0.39%)
best_bvv_unfrozen_ru MMLU [management]: 13.98% ± 1.66% (σ=2.68%)
best_bvv_unfrozen_ru MMLU [college_physics]: 12.45% ± 2.05% (σ=3.31%)
best_bvv_unfrozen_ru MMLU [jurisprudence]: 9.81% ± 1.97% (σ=3.19%)
best_bvv_unfrozen_ru MMLU [world_religions]: 4.44% ± 0.78% (σ=1.26%)
best_bvv_unfrozen_ru MMLU [sociology]: 10.80% ± 1.52% (σ=2.46%)
best_bvv_unfrozen_ru MMLU [us_foreign_policy]: 7.90% ± 1.43% (σ=2.30%)
best_bvv_unfrozen_ru MMLU [high_school_macroeconomics]: 19.31% ± 1.02% (σ=1.65%)
best_bvv_unfrozen_ru MMLU [computer_security]: 9.80% ± 2.28% (σ=3.68%)
best_bvv_unfrozen_ru MMLU [moral_scenarios]: 9.09% ± 0.40% (σ=0.64%)
best_bvv_unfrozen_ru MMLU [moral_disputes]: 8.58% ± 0.63% (σ=1.02%)
best_bvv_unfrozen_ru MMLU [electrical_engineering]: 14.00% ± 1.60% (σ=2.58%)
best_bvv_unfrozen_ru MMLU [astronomy]: 9.93% ± 1.62% (σ=2.61%)
best_bvv_unfrozen_ru MMLU [college_biology]: 13.19% ± 1.05% (σ=1.70%)
best_bvv_unfrozen_ru MMLU: 11.37% ± 0.18% (σ=0.29%)
best_bvv_unfrozen_ru ARC-e: 20.56% ± 0.65% (σ=1.05%)
best_bvv_unfrozen_ru ARC-c: 24.18% ± 0.80% (σ=1.29%)
best_bvv_unfrozen_ru C-SENSE: 18.79% ± 0.91% (σ=1.47%)
best_bvv_unfrozen_ru SQUAD: 13.55% ± 0.77% (σ=1.25%)
best_bvv_unfrozen_ru BLEU [en-ru]: 8.40% ± 0.44% (σ=0.70%)
best_bvv_unfrozen_ru BLEU [ru-en]: 7.96% ± 0.32% (σ=0.52%)
best_bvv_unfrozen_ru BLEU [en-zh]: 0.77% ± 0.14% (σ=0.22%)
best_bvv_unfrozen_ru BLEU [zh-en]: 1.09% ± 0.09% (σ=0.15%)
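
The result lines above appear to share one shape: run name, metric, optional bracketed subset, mean, a ± margin, and σ. The following is a minimal sketch for loading them into records, assuming that shape holds for every result line. The regular expression, the field names (`run`, `metric`, `subset`, `mean`, `margin`, `std`), and the helper `parse_result_line` are illustrative choices, not part of the tool that produced this log. What the ± margin represents (for example, a confidence interval) is not stated in the log, so the sketch stores it without interpreting it.

```python
import re

# Assumed line shape (inferred from the log, not from any documented format):
#   <run> <metric>[ [<subset>]]: <mean>% ± <margin>% (σ=<std>%)
LINE_RE = re.compile(
    r"^(?P<run>\S+)\s+(?P<metric>[\w-]+)(?:\s+\[(?P<subset>[^\]]+)\])?:\s+"
    r"(?P<mean>[\d.]+)%\s+±\s+(?P<margin>[\d.]+)%\s+\(σ=(?P<std>[\d.]+)%\)$"
)


def parse_result_line(line):
    """Return a dict for one result line, or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None  # e.g. the "Total parameters" header line
    row = m.groupdict()
    for key in ("mean", "margin", "std"):
        row[key] = float(row[key])  # percentages kept as plain floats
    return row


# A few lines copied from the log above.
sample = [
    "best_bvv_unfrozen_ru Total parameters:     0.5B",
    "best_bvv_unfrozen_ru MMLU [world_religions]: 4.44% ± 0.78% (σ=1.26%)",
    "best_bvv_unfrozen_ru MMLU: 11.37% ± 0.18% (σ=0.29%)",
    "best_bvv_unfrozen_ru BLEU [en-ru]: 8.40% ± 0.44% (σ=0.70%)",
]
rows = [r for r in map(parse_result_line, sample) if r is not None]

for r in rows:
    print(r["metric"], r["subset"], r["mean"], r["margin"], r["std"])
```

Lines that do not fit the assumed shape, such as the parameter-count header, are skipped rather than raising, so the same helper can be run over the whole file.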
