Model,Security impact category,True results,False results,Success rate
claude-4,UDM,14,5,73.68%
claude-4,UDR,9,15,37.50%
claude-4,DoS,15,11,57.69%
claude-4,All categories,38,31,55.07%
deepseek,UDM,9,10,47.37%
deepseek,UDR,9,15,37.50%
deepseek,DoS,12,14,46.15%
deepseek,All categories,30,39,43.48%
gemini-2.5,UDM,15,4,78.95%
gemini-2.5,UDR,4,20,16.67%
gemini-2.5,DoS,14,12,53.85%
gemini-2.5,All categories,33,36,47.83%
gpt-4.1,UDM,15,4,78.95%
gpt-4.1,UDR,16,6,72.73%
gpt-4.1,DoS,13,13,50.00%
gpt-4.1,All categories,44,23,65.67%
qwen,UDM,5,14,26.32%
qwen,UDR,9,15,37.50%
qwen,DoS,18,8,69.23%
qwen,All categories,32,37,46.38%
