Model,Security impact category,True results,False results,Success rate
claude-4,UDM,7,13,35.0%
claude-4,UDR,11,9,55.0%
claude-4,DoS,11,11,50.0%
claude-4,All categories combined,29,33,46.77%
deepseek,UDM,8,12,40.0%
deepseek,UDR,5,15,25.0%
deepseek,DoS,9,13,40.91%
deepseek,All categories combined,22,40,35.48%
gemini-2.5,UDM,12,8,60.0%
gemini-2.5,UDR,0,20,0.0%
gemini-2.5,DoS,9,13,40.91%
gemini-2.5,All categories combined,21,41,33.87%
gpt-4.1,UDM,13,7,65.0%
gpt-4.1,UDR,3,16,15.79%
gpt-4.1,DoS,8,14,36.36%
gpt-4.1,All categories combined,24,37,39.34%
qwen,UDM,8,12,40.0%
qwen,UDR,4,16,20.0%
qwen,DoS,11,11,50.0%
qwen,All categories combined,23,39,37.1%
