Model name,Security impact category,Samples with result true,Samples with result false,Success rate
claude-4,UDM,12,10,54.55%
claude-4,UDR,16,7,69.57%
claude-4,DoS,14,13,51.85%
claude-4,All categories,42,30,58.33%
deepseek,UDM,9,13,40.91%
deepseek,UDR,8,15,34.78%
deepseek,DoS,8,19,29.63%
deepseek,All categories,25,47,34.72%
gemini-2.5,UDM,10,12,45.45%
gemini-2.5,UDR,0,23,0.00%
gemini-2.5,DoS,11,16,40.74%
gemini-2.5,All categories,21,51,29.17%
gpt-4.1,UDM,17,5,77.27%
gpt-4.1,UDR,17,6,73.91%
gpt-4.1,DoS,10,17,37.04%
gpt-4.1,All categories,44,28,61.11%
qwen,UDM,5,17,22.73%
qwen,UDR,5,18,21.74%
qwen,DoS,14,13,51.85%
qwen,All categories,24,48,33.33%
