DOCUMENT GENERATION BENCHMARK - DETAILED ANALYSIS REPORT
============================================================

Generated on: 2025-09-17 17:44:35

BASIC STATISTICS
------------------------------
Total queries processed: 40
Average user profile accuracy: 0.487
Average intent capture accuracy: 0.510
Average citation accuracy: 0.185
Average document quality score: 4.934
Overall average score: 1.262

ADVANCED METRICS
------------------------------
Score Statistics:
  user_profile_accuracy:
    Mean: 0.487 ± 0.174
    Median: 0.456
    Range: 0.209 - 0.818

  intent_capture_accuracy:
    Mean: 0.510 ± 0.207
    Median: 0.600
    Range: 0.200 - 1.000

  citation_accuracy:
    Mean: 0.185 ± 0.244
    Median: 0.000
    Range: 0.000 - 0.857

  document_quality_score:
    Mean: 4.934 ± 0.154
    Median: 5.000
    Range: 4.330 - 5.000
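
The per-metric figures above could be produced along the following lines. This is a minimal sketch, not the benchmark's own code: the report does not say what the value after "±" is, so a sample standard deviation is assumed here, and `summarize` and `example_scores` are hypothetical names with made-up data.

```python
import statistics


def summarize(scores):
    """Return mean, spread, median, and range for a list of per-query scores."""
    if len(scores) < 2:
        raise ValueError("need at least two scores to compute a standard deviation")
    return {
        "mean": statistics.mean(scores),
        # Assumption: the report's "±" value is a sample standard deviation (n - 1).
        "std": statistics.stdev(scores),
        "median": statistics.median(scores),
        "min": min(scores),
        "max": max(scores),
    }


# Hypothetical per-query scores, for illustration only (not benchmark data).
example_scores = [0.2, 0.4, 0.6, 0.6, 1.0]
summary = summarize(example_scores)

print(f"    Mean: {summary['mean']:.3f} ± {summary['std']:.3f}")
print(f"    Median: {summary['median']:.3f}")
print(f"    Range: {summary['min']:.3f} - {summary['max']:.3f}")
```

If the report used a population standard deviation instead, `statistics.pstdev` would be the drop-in replacement.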

Correlations:
  profile_intent_correlation: -0.388
  intent_quality_correlation: -0.087
  citation_quality_correlation: 0.153
  profile_quality_correlation: 0.080
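
The report does not state which correlation coefficient is used. Assuming it is Pearson's r computed over per-query score pairs, a minimal sketch looks like this; `pearson` and both score lists are hypothetical, and the data is made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # The coefficient is undefined when one metric never varies.
        raise ValueError("correlation is undefined for a constant list")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-query scores (not benchmark data): intent falls as profile rises.
profile_scores = [0.2, 0.4, 0.6, 0.8]
intent_scores = [1.0, 0.6, 0.6, 0.2]

r = pearson(profile_scores, intent_scores)
print(f"  profile_intent_correlation: {r:.3f}")
```

A negative value, such as the -0.388 reported for profile versus intent, means that queries scoring higher on one metric tended to score lower on the other.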

PERFORMANCE BY DOCUMENT TYPE
------------------------------
email: 1.234 (n=15)
status_report: 1.305 (n=18)
faq: 1.214 (n=7)

PERFORMANCE BY USER ROLE
------------------------------
Project Manager: 1.263 (n=27)
UX Designer: 1.260 (n=5)
Applied Scientist: 1.306 (n=4)
Product Manager: 1.249 (n=2)
Software Engineer: 1.190 (n=2)
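
As a consistency check, the group averages in the two sections above can be weighted by their query counts and compared with the overall average score of 1.262. The sketch below assumes each group value is the mean of the same per-query overall score and that n counts queries; `weighted_mean` is a hypothetical helper, while the numbers are copied from this report.

```python
def weighted_mean(groups):
    """Combine per-group (mean, n) pairs into a single count-weighted mean."""
    total_n = sum(n for _, n in groups.values())
    if total_n == 0:
        raise ValueError("no queries in any group")
    return sum(mean * n for mean, n in groups.values()) / total_n


# (group mean, query count) pairs as listed in the report.
by_document_type = {
    "email": (1.234, 15),
    "status_report": (1.305, 18),
    "faq": (1.214, 7),
}
by_user_role = {
    "Project Manager": (1.263, 27),
    "UX Designer": (1.260, 5),
    "Applied Scientist": (1.306, 4),
    "Product Manager": (1.249, 2),
    "Software Engineer": (1.190, 2),
}

doc_type_total = sum(n for _, n in by_document_type.values())
role_total = sum(n for _, n in by_user_role.values())
doc_type_mean = weighted_mean(by_document_type)
role_mean = weighted_mean(by_user_role)

print(f"queries by document type: {doc_type_total}, by user role: {role_total}")
print(f"weighted mean by document type: {doc_type_mean:.4f}")
print(f"weighted mean by user role: {role_mean:.4f}")
```

Both groupings account for all 40 queries. Both weighted means land within about 0.001 of the reported 1.262, and the small gap is consistent with the group means having been rounded to three decimals.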

CITATION ANALYSIS
------------------------------
Documents with citations: 40/40
Average citations per document: 29.18

Most cited messages:
  Msg_4169: 25 citations
  Msg_3457: 24 citations
  Msg_1354: 17 citations
  Msg_4283: 17 citations
  Msg_1654: 15 citations
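
The citation figures above could be derived from per-document citation lists roughly as follows. This is a sketch under assumptions: the report does not say whether a message cited twice in one document counts once or twice (each occurrence is counted here), and `citation_stats` and the message IDs below are hypothetical.

```python
from collections import Counter


def citation_stats(citations_per_document):
    """Return (per-message counts, documents with citations, mean citations per document)."""
    if not citations_per_document:
        raise ValueError("no documents supplied")
    # Assumption: every occurrence of a message ID counts as one citation.
    counts = Counter(msg for doc in citations_per_document for msg in doc)
    with_citations = sum(1 for doc in citations_per_document if doc)
    average = sum(len(doc) for doc in citations_per_document) / len(citations_per_document)
    return counts, with_citations, average


# Hypothetical citation lists for three documents (not benchmark data).
example_docs = [
    ["Msg_A", "Msg_B"],
    ["Msg_A"],
    ["Msg_A", "Msg_C", "Msg_B"],
]

counts, with_citations, average = citation_stats(example_docs)
top_two = counts.most_common(2)

print(f"Documents with citations: {with_citations}/{len(example_docs)}")
print(f"Average citations per document: {average:.2f}")
for msg_id, n in top_two:
    print(f"  {msg_id}: {n} citations")
```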

QUALITY DIMENSIONS ANALYSIS
------------------------------
citation_quality: 4.975 ± 0.156
