========================================
📋 Task: math50k | Split: train
🔑 Keys Mapped -> Q: 'instruction', A: 'output', GT: 'answer'
========================================
🧪 Ground Truth Type Analysis (from gt_key)
========================================
Type [FLOAT]: 0 (0.00%)
Type [ABC]: 0 (0.00%)
Type [TEXT]: 50000 (100.00%)
========================================
📊 Raw Text Length Analysis (chars)
========================================
Question (instruction):
  Mean : 244.69
  Std  : 185.89
  Min  : 21
  Max  : 3579
  Count: 50000

Output (output):
  Mean : 1141.13
  Std  : 638.92
  Min  : 1
  Max  : 6333
  Count: 50000

========================================
🔢 Tokenized Length Analysis (tokens)
========================================
Question Tokens:
  Mean : 81.02
  Std  : 55.32
  Min  : 20
  Max  : 1655
  Count: 50000

Answer Tokens:
  Mean : 412.30
  Std  : 263.88
  Min  : 2
  Max  : 4321
  Count: 50000

Total Tokens:
  Mean : 493.31
  Std  : 283.98
  Min  : 36
  Max  : 4383
  Count: 50000

--------------------
--- Question Tokens (instruction) ---
<= 128: 43368 (86.74%)
<= 256: 49428 (98.86%)
<= 512: 49934 (99.87%)
<= 1024: 49990 (99.98%)

--- Answer Tokens (output) ---
<= 128: 5316 (10.63%)
<= 256: 16118 (32.24%)
<= 512: 35222 (70.44%)
<= 1024: 48730 (97.46%)

--- Total Tokens ---
<= 128: 2039 (4.08%)
<= 256: 10787 (21.57%)
<= 512: 29516 (59.03%)
<= 1024: 47631 (95.26%)
========================================