Model,Security impact category,True results,False results,Success rate
claude-4,UDM,16,6,72.73%
claude-4,UDR,14,9,60.87%
claude-4,DoS,16,11,59.26%
claude-4,All categories,46,26,63.89%
deepseek,UDM,9,13,40.91%
deepseek,UDR,7,16,30.43%
deepseek,DoS,8,19,29.63%
deepseek,All categories,24,48,33.33%
gemini-2.5,UDM,14,8,63.64%
gemini-2.5,UDR,2,21,8.70%
gemini-2.5,DoS,13,14,48.15%
gemini-2.5,All categories,29,43,40.28%
gpt-4.1,UDM,16,6,72.73%
gpt-4.1,UDR,13,10,56.52%
gpt-4.1,DoS,11,16,40.74%
gpt-4.1,All categories,40,32,55.56%
qwen,UDM,12,10,54.55%
qwen,UDR,6,17,26.09%
qwen,DoS,15,12,55.56%
qwen,All categories,33,39,45.83%
