Model name,Security impact category,True results,False results,Success rate
claude-4,UDM,13,6,68.42%
claude-4,UDR,12,12,50.00%
claude-4,DoS,15,11,57.69%
claude-4,All categories combined,40,29,57.97%
deepseek,UDM,8,11,42.11%
deepseek,UDR,7,17,29.17%
deepseek,DoS,11,15,42.31%
deepseek,All categories combined,26,43,37.68%
gemini-2.5,UDM,13,6,68.42%
gemini-2.5,UDR,2,22,8.33%
gemini-2.5,DoS,18,8,69.23%
gemini-2.5,All categories combined,33,36,47.83%
gpt-4.1,UDM,12,7,63.16%
gpt-4.1,UDR,16,8,66.67%
gpt-4.1,DoS,13,13,50.00%
gpt-4.1,All categories combined,41,28,59.42%
qwen,UDM,11,8,57.89%
qwen,UDR,3,21,12.50%
qwen,DoS,12,14,46.15%
qwen,All categories combined,26,43,37.68%
