Model,SOP,structural_alignment,property_fidelity,semantic_fidelity,code_bonus,code_compliance,norm_counterexample_traces,norm_minor_issues,errors,execution_compliance,Final_score
Agent,eval_model_01,6.0,8.0,7.33,-0.33,0.5,0.17,0.0,No,0.86,0.72
Agent_Debuged,eval_model_01,6.0,6.33,6.67,0.5,0.6,0.0,0.0,No,1.0,0.84
Qwen,eval_model_01,5.33,5.0,6.0,0.08,0.45,1.0,0.0,No,0.2,0.3
Agent,eval_model_02,4.67,4.0,5.67,0.5,0.48,0.0,1.0,No,0.8,0.67
Agent_Debuged,eval_model_02,5.33,5.0,5.0,0.62,0.53,0.0,0.0,No,1.0,0.81
Qwen,eval_model_02,5.67,5.33,5.33,0.0,0.43,1.0,0.33,No,0.13,0.25
Agent,eval_model_03,3.33,4.0,2.33,0.08,0.27,0.38,0.67,No,0.56,0.44
Agent_Debuged,eval_model_03,2.67,4.67,3.0,0.17,0.31,0.0,0.0,No,1.0,0.72
Qwen,eval_model_03,6.0,8.33,7.0,0.5,0.67,1.0,1.0,No,0.0,0.27
Agent,eval_model_04,6.33,8.0,6.67,0.25,0.61,1.0,0.0,No,0.2,0.36
Agent_Debuged,eval_model_04,7.67,8.0,7.33,0.29,0.67,0.0,1.0,No,0.8,0.75
Qwen,eval_model_04,5.33,3.67,4.0,0.0,0.34,0.0,0.5,No,0.9,0.68
Agent,eval_model_05,4.33,5.0,5.33,0.25,0.44,0.0,1.0,No,0.8,0.66
Agent_Debuged,eval_model_05,4.33,5.0,5.0,0.88,0.56,0.0,0.0,No,1.0,0.82
Qwen,eval_model_05,3.33,3.67,3.67,0.25,0.34,0.0,0.25,Yes,0.0,0.14
Agent,eval_model_06,5.67,8.0,5.33,0.5,0.6,1.0,1.0,No,0.0,0.24
Agent_Debuged,eval_model_06,4.0,6.67,5.0,0.49,0.51,0.0,1.0,No,0.8,0.68
Qwen,eval_model_06,6.67,6.67,8.33,0.0,0.58,0.0,0.0,No,1.0,0.83
Agent,eval_model_07,8.0,8.0,7.0,0.08,0.63,1.0,1.0,No,0.0,0.25
Agent_Debuged,eval_model_07,8.33,5.67,6.67,0.0,0.55,0.0,0.5,No,0.9,0.76
Qwen,eval_model_07,7.33,8.67,8.33,0.0,0.65,0.0,0.0,No,1.0,0.86
Agent,eval_model_08,5.0,6.33,4.33,0.25,0.47,1.0,1.0,No,0.0,0.19
Agent_Debuged,eval_model_08,6.0,7.0,6.0,0.46,0.6,0.0,0.0,No,1.0,0.84
Qwen,eval_model_08,6.67,9.0,5.33,-0.12,0.53,1.0,0.0,No,0.2,0.33
Agent,eval_model_09,5.0,4.67,4.33,0.5,0.48,0.88,0.0,No,0.3,0.37
Agent_Debuged,eval_model_09,4.0,6.33,5.67,0.44,0.51,0.0,1.0,No,0.8,0.68
Qwen,eval_model_09,4.33,8.0,5.67,0.19,0.52,1.0,0.0,No,0.2,0.33
Agent,eval_model_10,6.33,8.0,6.67,0.43,0.65,1.0,0.29,No,0.14,0.34
Agent_Debuged,eval_model_10,7.33,8.67,7.0,-0.25,0.57,0.0,1.0,No,0.8,0.71
Qwen,eval_model_10,6.33,6.33,6.67,0.33,0.58,0.0,0.0,Yes,0.0,0.23
