# Synthetic Validation Data Schema (for Section 7.2 / Fig S1–S4)

This package contains **three CSVs** that together provide the data needed for **multi-seed risk estimation** and **multi-c (probe compute) decay fitting**.

## 1) `synthetic_task_specs.csv` (task-level)
One row per synthetic task.

**Primary key**
- `task_id` (string)

**Task control knobs (candidate phase-diagram axes)**
- `noise_rate` (float, 0–1): label / observation noise
- `signal_strength` (float): separability / signal-to-noise proxy
- `domain_shift` (float): train-test shift intensity
- `model_mismatch` (float): teacher–student mismatch
- `n_train` (int): training set size

**Optional for sanity checks (NOT required in the paper)**
- `regime_true` (string): heuristic ground-truth regime label used by the generator
- `alpha_true` (float): ground-truth decay exponent used by the generator
- `K_true` (float): ground-truth decay scale used by the generator
- `R_star_proxy` (float): proxy upper-bound performance
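
As a sanity check, `alpha_true` and `K_true` can be compared against values fitted from the observed risk at each `probe_c`. The sketch below assumes a pure power-law decay, risk ≈ K · c^(−alpha), with no irreducible floor. The schema does not specify this functional form, so treat it as an assumption; the function name `fit_power_law` and the toy values are illustrative only.

```python
import math

def fit_power_law(cs, risks):
    """Least-squares fit of log(risk) = log(K) - alpha * log(c).

    Assumes risk ~ K * c**(-alpha); this form is an assumption,
    not something the schema itself defines.
    """
    xs = [math.log(c) for c in cs]
    ys = [math.log(r) for r in risks]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return -slope, math.exp(intercept)  # (alpha_hat, K_hat)

# Noise-free toy curve (hypothetical values) on the default probe_c grid.
alpha_true, K_true = 0.5, 2.0
grid = [25, 50, 100, 200]
toy_risks = [K_true * c ** (-alpha_true) for c in grid]
alpha_hat, K_hat = fit_power_law(grid, toy_risks)
```

With only four grid points, the fit is sensitive to noise, so it is probably best read as a consistency check against the generator rather than as a precise estimate.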

## 2) `synthetic_run_logs.csv` (run-level; MUST-HAVE)
One row per `(task_id, seed, probe_c)`.

**Composite primary key**
- `task_id` (string)
- `seed` (int)
- `probe_c` (int): compute budget for probing (e.g., steps/tokens)

**Outcomes**
- `R_true` (float): final metric for the run (full training)
- `R_pred` (float): predictor output given prefix info at compute `probe_c`
- `squared_error` (float): `(R_true - R_pred)^2`

**Prefix / probe features X_d(c) (examples; replace with your own features)**
- `probe_loss`
- `loss_decay_early`
- `grad_norm`

**Static features X_s (copied from task specs for convenience)**
- `noise_rate`, `signal_strength`, `domain_shift`, `model_mismatch`, `n_train`
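
The run-level constraints above (a unique `(task_id, seed, probe_c)` key and `squared_error` equal to `(R_true - R_pred)^2`) can be checked mechanically. The following is a minimal sketch, assuming rows arrive as dicts of strings such as those produced by `csv.DictReader`; the function name `check_run_rows`, the tolerance, and the sample rows are hypothetical.

```python
import math
from collections import Counter

def check_run_rows(rows, tol=1e-6):
    """Return a list of human-readable problems found in run-log rows."""
    rows = list(rows)
    problems = []

    # The composite key (task_id, seed, probe_c) must be unique.
    keys = Counter((r["task_id"], int(r["seed"]), int(r["probe_c"])) for r in rows)
    for key, count in keys.items():
        if count > 1:
            problems.append(f"duplicate key {key}: {count} rows")

    # squared_error must match (R_true - R_pred)^2 up to rounding.
    for r in rows:
        expected = (float(r["R_true"]) - float(r["R_pred"])) ** 2
        actual = float(r["squared_error"])
        if not math.isclose(actual, expected, rel_tol=tol, abs_tol=tol):
            key = (r["task_id"], r["seed"], r["probe_c"])
            problems.append(f"squared_error mismatch for {key}")
    return problems

# Hypothetical sample rows, for illustration only.
good_rows = [
    {"task_id": "t0", "seed": "0", "probe_c": "25",
     "R_true": "0.80", "R_pred": "0.70", "squared_error": "0.01"},
    {"task_id": "t0", "seed": "1", "probe_c": "25",
     "R_true": "0.75", "R_pred": "0.70", "squared_error": "0.0025"},
]
bad_rows = good_rows + [dict(good_rows[0], squared_error="0.5")]
```

To run this against the real file, one option is `check_run_rows(csv.DictReader(open("synthetic_run_logs.csv", newline="")))`.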

## 3) `synthetic_risk_curves.csv` (aggregated; derived)
One row per `(task_id, probe_c)`.

- `L_hat`: mean of `squared_error` over seeds
- `Var_R`: variance of `R_true` across seeds (conditional variance component)
- `mean_R`: mean of `R_true` across seeds
- `n_seeds`: number of seeds aggregated
- plus the task control knobs from `synthetic_task_specs.csv`, for plotting
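
Because this file is derived, it can be regenerated from the run logs. Below is a minimal sketch of that aggregation using only the standard library. It assumes `Var_R` is the population variance (`statistics.pvariance`); the schema does not say whether the population or sample estimator is intended, so switch to `statistics.variance` if the latter applies. The function name `aggregate_risk_curves` and the sample rows are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pvariance

def aggregate_risk_curves(rows):
    """Group run-log rows by (task_id, probe_c) and aggregate over seeds."""
    groups = defaultdict(list)
    for r in rows:
        groups[(r["task_id"], int(r["probe_c"]))].append(r)

    curves = []
    for (task_id, probe_c), group in sorted(groups.items()):
        r_true = [float(r["R_true"]) for r in group]
        sq_err = [float(r["squared_error"]) for r in group]
        curves.append({
            "task_id": task_id,
            "probe_c": probe_c,
            "L_hat": mean(sq_err),
            "Var_R": pvariance(r_true),  # assumption: population variance
            "mean_R": mean(r_true),
            "n_seeds": len({r["seed"] for r in group}),
        })
    return curves

# Hypothetical sample rows, for illustration only.
sample_rows = [
    {"task_id": "t0", "seed": "0", "probe_c": "25", "R_true": "0.8", "squared_error": "0.01"},
    {"task_id": "t0", "seed": "1", "probe_c": "25", "R_true": "0.6", "squared_error": "0.03"},
    {"task_id": "t0", "seed": "0", "probe_c": "50", "R_true": "0.8", "squared_error": "0.004"},
    {"task_id": "t0", "seed": "1", "probe_c": "50", "R_true": "0.6", "squared_error": "0.002"},
]
curves = aggregate_risk_curves(sample_rows)
```

The task control knobs are omitted here for brevity; they could be joined in from the task specs by `task_id`.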

## Recommended defaults (paper-ready density)
- tasks: 5 (noise) × 4 (signal) × 3 (shift) × 3 (mismatch) = 180 tasks
- seeds per task: 5
- probe_c grid: {25, 50, 100, 200}
- total run rows: 180 × 5 × 4 = 3600
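
The counts above follow from a full factorial grid. The sketch below enumerates one such grid to confirm the arithmetic. Only the level counts (5, 4, 3, 3), the seed count, and the `probe_c` grid come from the defaults; the individual level values and the `task_id` format are placeholders, not the values used by the generator.

```python
from itertools import product

# Placeholder level values: only the number of levels per knob is given above.
noise_levels = [0.0, 0.1, 0.2, 0.3, 0.4]   # 5 levels
signal_levels = [0.5, 1.0, 2.0, 4.0]       # 4 levels
shift_levels = [0.0, 0.5, 1.0]             # 3 levels
mismatch_levels = [0.0, 0.5, 1.0]          # 3 levels
seeds = range(5)
probe_c_grid = [25, 50, 100, 200]

grid = product(noise_levels, signal_levels, shift_levels, mismatch_levels)
tasks = [
    {
        "task_id": f"task_{i:03d}",  # hypothetical naming scheme
        "noise_rate": noise,
        "signal_strength": signal,
        "domain_shift": shift,
        "model_mismatch": mismatch,
    }
    for i, (noise, signal, shift, mismatch) in enumerate(grid)
]

# One run-log key per (task_id, seed, probe_c).
run_keys = [
    (task["task_id"], seed, c)
    for task in tasks
    for seed in seeds
    for c in probe_c_grid
]

print(len(tasks), len(run_keys))
```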
