Model,Security impact category,Count of true results,Count of false results,Success rate
claude-4,UDM,37,24,60.66%
claude-4,UDR,32,35,47.76%
claude-4,DoS,43,32,57.33%
claude-4,All categories (total),112,91,55.17%
deepseek,UDM,20,41,32.79%
deepseek,UDR,18,49,26.87%
deepseek,DoS,26,49,34.67%
deepseek,All categories (total),64,139,31.53%
gemini-2.5,UDM,39,22,63.93%
gemini-2.5,UDR,5,62,7.46%
gemini-2.5,DoS,39,36,52.00%
gemini-2.5,All categories (total),83,120,40.89%
gpt-4.1,UDM,40,21,65.57%
gpt-4.1,UDR,37,30,55.22%
gpt-4.1,DoS,34,41,45.33%
gpt-4.1,All categories (total),111,92,54.68%
qwen,UDM,32,29,52.46%
qwen,UDR,11,56,16.42%
qwen,DoS,38,37,50.67%
qwen,All categories (total),81,122,39.90%
