Model,Dataset,Method,Metric,Value,tau,N
Llama-4-Scout,JailbreakBench,avg_pairwise_bertscore,AUROC,0.7672222222222222,,5
Llama-4-Scout,JailbreakBench,embedding_variance,AUROC,0.6536111111111111,,5
Llama-4-Scout,JailbreakBench,levenshtein_variance,AUROC,0.2891666666666666,,5
Llama-4-Scout,JailbreakBench,semantic_entropy,AUROC,0.685138888888889,0.1,5
Qwen-2.5-7B,JailbreakBench,avg_pairwise_bertscore,AUROC,0.615,,5
Qwen-2.5-7B,JailbreakBench,embedding_variance,AUROC,0.7205555555555556,,5
Qwen-2.5-7B,JailbreakBench,levenshtein_variance,AUROC,0.6013888888888889,,5
Qwen-2.5-7B,JailbreakBench,semantic_entropy,AUROC,0.6901388888888889,0.1,5
Llama-4-Scout,HarmBench,avg_pairwise_bertscore,AUROC,0.5057155921353451,,5
Llama-4-Scout,HarmBench,embedding_variance,AUROC,0.6837372351775645,,5
Llama-4-Scout,HarmBench,levenshtein_variance,AUROC,0.3968907178783722,,5
Llama-4-Scout,HarmBench,semantic_entropy,AUROC,0.6912818167962201,0.1,5
Qwen-2.5-7B,HarmBench,avg_pairwise_bertscore,AUROC,0.4311842706904435,,5
Qwen-2.5-7B,HarmBench,embedding_variance,AUROC,0.7242798353909465,,5
Qwen-2.5-7B,HarmBench,levenshtein_variance,AUROC,0.572778539856729,,5
Qwen-2.5-7B,HarmBench,semantic_entropy,AUROC,0.7325864959609816,0.1,5
