# End-to-end ProbeSwitch transfer on external tasks (t=0.12 vs safe t=0.22)

Goal: go beyond “decision evidence” (predicting when BERW should win) and show **end-to-end**
that a *fixed* threshold policy transfers to new tasks with an interpretable safety/optimality trade-off.

Setup:
- Decision rule: threshold the misranking probe value to choose between two actions
  - low-misranking → `CMA-ES-sep`
  - high-misranking → `BERW-HeteroRobust`
- Two fixed thresholds:
  - aggressive transfer: `t=0.12`
  - safe default: `t=0.22` (switches to `BERW-HeteroRobust` less often, which reduces negative-transfer risk)
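
The rule above can be sketched in a few lines of Python. This is a minimal illustration, not the repository's implementation: the function name `choose_optimizer` and the constant names are hypothetical, and treating a probe value exactly equal to the threshold as "high" is an assumption, since the boundary case is not specified here.

```python
# Fixed thresholds from the setup above.
AGGRESSIVE_T = 0.12
SAFE_T = 0.22


def choose_optimizer(misranking: float, threshold: float) -> str:
    """Pick one of the two actions from a misranking-probe value.

    Assumption: a value exactly at the threshold counts as "high".
    """
    if misranking >= threshold:
        return "BERW-HeteroRobust"  # high misranking
    return "CMA-ES-sep"  # low misranking


if __name__ == "__main__":
    probe = 0.15  # illustrative value, not a measured result
    print(choose_optimizer(probe, AGGRESSIVE_T))  # BERW-HeteroRobust
    print(choose_optimizer(probe, SAFE_T))        # CMA-ES-sep
```

A probe value between the two thresholds is where the policies differ: the aggressive threshold switches to `BERW-HeteroRobust`, while the safe threshold stays with `CMA-ES-sep`.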

## Per-task evidence packs

- RL (CartPole): `evidence/application_rl_cartpole_heavytail_quadratic_cost_probeswitch_mr_transfer/`
- HPO (digits0): `evidence/application_hpo_noisy_logreg_digits0_sigma1p0_probeswitch_mr_transfer/`
- LQR (state-dependent heavy-tail): `evidence/application_lqr_heavytail_control_fixed_budget_resample_probeswitch_mr_transfer/`

Each pack contains:
- `final_boxplot.png`
- `runs.csv`
- `summary.csv`
- `probe_values.csv`
- `pairwise_sign_test_<metric>.csv` (paired sign test across seeds)
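
For readers unfamiliar with the paired sign test, the sketch below shows one common form: an exact two-sided binomial test on the signs of per-seed differences between two methods. It is an illustration under stated assumptions, not the script that produced the CSV. The function name `paired_sign_test` is hypothetical, ties are dropped (a common convention that may differ from the repository's), and inputs are assumed to be aligned by seed.

```python
from math import comb
from typing import Sequence, Tuple


def paired_sign_test(a: Sequence[float], b: Sequence[float]) -> Tuple[int, int, float]:
    """Exact two-sided sign test on paired values (one pair per seed).

    Returns (n_positive, n_negative, p_value), where a difference is a[i] - b[i].
    Assumption: tied pairs are dropped before testing.
    """
    if len(a) != len(b):
        raise ValueError("a and b must have one value per seed, in the same order")
    diffs = [x - y for x, y in zip(a, b) if x != y]
    n = len(diffs)
    if n == 0:
        return 0, 0, 1.0  # no informative pairs
    n_pos = sum(d > 0 for d in diffs)
    n_neg = n - n_pos
    k = min(n_pos, n_neg)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return n_pos, n_neg, min(1.0, 2.0 * tail)
```

With five seeds where one method is strictly larger on every seed, the smallest achievable two-sided p-value is `2 * (1/32) = 0.0625`, so small seed counts limit how strong the evidence from this test can be.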

## Cross-task summary

- Table: `evidence/probeswitch_external_transfer/summary.csv`
- Plot (win rate of ProbeSwitch vs CMA): `evidence/probeswitch_external_transfer/winrate_switch_vs_cma.png`

Generated by: `python3 tools/summarize_probeswitch_external_transfer.py --out-dir evidence/probeswitch_external_transfer`.
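
As a reading aid for the win-rate plot, here is a minimal sketch of how a per-task win rate could be computed from paired per-seed results. It is not taken from `tools/summarize_probeswitch_external_transfer.py`: the function name `win_rate` is hypothetical, whether lower or higher values are better depends on the task metric (exposed as a parameter), and counting ties as non-wins is an assumption.

```python
from typing import Sequence


def win_rate(switch: Sequence[float], baseline: Sequence[float],
             lower_is_better: bool = True) -> float:
    """Fraction of seeds on which `switch` strictly beats `baseline`.

    Assumptions: values are paired by seed; ties count as non-wins.
    """
    if len(switch) != len(baseline) or len(switch) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    if lower_is_better:
        wins = sum(s < b for s, b in zip(switch, baseline))
    else:
        wins = sum(s > b for s, b in zip(switch, baseline))
    return wins / len(switch)
```

For example, `win_rate([1, 1, 3, 1], [2, 2, 2, 2])` returns `0.75` when lower is better, and `0.25` with `lower_is_better=False`.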
