Model,Security impact category,True results,False results,Success rate
claude-4,UDM,8,12,40.0%
claude-4,UDR,6,14,30.0%
claude-4,DoS,12,10,54.55%
claude-4,All categories,26,36,41.94%
deepseek,UDM,3,17,15.0%
deepseek,UDR,4,16,20.0%
deepseek,DoS,7,15,31.82%
deepseek,All categories,14,48,22.58%
gemini-2.5,UDM,12,8,60.0%
gemini-2.5,UDR,1,19,5.0%
gemini-2.5,DoS,8,14,36.36%
gemini-2.5,All categories,21,41,33.87%
gpt-4.1,UDM,12,8,60.0%
gpt-4.1,UDR,8,12,40.0%
gpt-4.1,DoS,10,12,45.45%
gpt-4.1,All categories,30,32,48.39%
qwen,UDM,9,11,45.0%
qwen,UDR,2,18,10.0%
qwen,DoS,11,11,50.0%
qwen,All categories,22,40,35.48%
