# H2 Response Generation Log - meta-llama/Llama-4-Scout-17B-16E-Instruct

Generated: 2025-08-23 07:07:49

## Summary

- **Model:** meta-llama/Llama-4-Scout-17B-16E-Instruct → meta-llama/llama-4-scout
- **Input dataset:** /research_storage/data/processed/h2_harmbench_twins_test.jsonl
- **Total prompts:** 162
- **Overall success rate:** 100.0% (162 of 162 prompts successful)
- **Output file:** /research_storage/outputs/h2/llama-4-scout-17b-16e-instruct_h2_responses.jsonl

## Comprehensive Metrics

### Success Rates by Label
- **Harmful prompts:** 81 total | 81 success | 0 failed | **100.0% success rate**
- **Benign prompts:** 81 total | 81 success | 0 failed | **100.0% success rate**

### Response Generation Metrics
- **Total responses generated:** 806
- **Empty responses encountered:** 4
- **Average response length:** 2088 characters
- **Response length range:** 23 - 6164 characters
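
The response count can be cross-checked against the prompt count and `n_responses`. The sketch below assumes (the log does not state this) that "Total responses generated" counts only non-empty responses and that empty responses were not retried or replaced. The function names are hypothetical and are not taken from the generation script.

```python
def requested_responses(total_prompts: int, n_responses: int) -> int:
    """Number of responses requested across all prompts."""
    return total_prompts * n_responses


def non_empty_responses(total_prompts: int, n_responses: int, empty: int) -> int:
    """Responses left after discarding empty ones (assumed interpretation)."""
    return requested_responses(total_prompts, n_responses) - empty


# Figures taken from this log: 162 prompts, n_responses = 5, 4 empty responses.
requested = requested_responses(162, 5)
kept = non_empty_responses(162, 5, 4)
print(f"requested={requested}, non-empty={kept}")
```

Under that assumption the figures reconcile: 162 prompts at 5 responses each gives 810 requested, and removing the 4 empty responses leaves the 806 reported above.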

### Processing Performance
- **Total processing time:** 5410.3 seconds (about 90.2 minutes)
- **Average time per prompt:** 33.40 seconds
- **Processing rate:** 1.8 prompts/minute
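
The two derived timing figures follow from the total time and the prompt count. A minimal sketch of that arithmetic, with hypothetical helper names that are not part of the generation script:

```python
def seconds_per_prompt(total_seconds: float, prompts: int) -> float:
    """Average wall-clock time spent per prompt."""
    return total_seconds / prompts


def prompts_per_minute(total_seconds: float, prompts: int) -> float:
    """Throughput expressed as prompts processed per minute."""
    return prompts / (total_seconds / 60.0)


# Figures taken from this log: 5410.3 seconds for 162 prompts.
avg_seconds = seconds_per_prompt(5410.3, 162)
rate = prompts_per_minute(5410.3, 162)
print(f"{avg_seconds:.2f} s/prompt, {rate:.1f} prompts/min")
```

Each prompt's time covers all of its `n_responses` generations, so the per-response average would be lower if the responses were generated sequentially.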

## Generation Parameters

- **n_responses:** 5
- **temperature:** 0.7
- **top_p:** 0.95
- **max_new_tokens:** 1024

