# KataGo Win-Probability Claim-Consistency Experiment

## Setup Summary

- Train dataset: `katago/example_train.jsonl`
- Eval dataset: `katago/example_eval.jsonl`
- Train rows: `8`
- Eval rows: `4`
- Variants: `lm_only, no_consistency_loss, rationale_only, full_consistency, random_consistency`
- Batch size: `4`
- Epochs: `5`
- LR: `0.0003`
- Model: `d_model=256`, `layers=4`, `heads=8`, `d_ff=1024`
- Max sequence length: `256`
- Consistency weight: `0.5`
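
The setup above can be captured in a single config object, which makes the derived quantities (per-head width, optimizer steps) easy to check. This is a minimal sketch: the class name `ExperimentConfig` and all field and method names are hypothetical, and the step count assumes one optimizer step per batch with no gradient accumulation. Only the values are taken from the list above.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    # Values are copied from the setup summary; the field names are illustrative.
    train_path: str = "katago/example_train.jsonl"
    eval_path: str = "katago/example_eval.jsonl"
    variants: tuple = (
        "lm_only",
        "no_consistency_loss",
        "rationale_only",
        "full_consistency",
        "random_consistency",
    )
    batch_size: int = 4
    epochs: int = 5
    lr: float = 3e-4
    d_model: int = 256
    layers: int = 4
    heads: int = 8
    d_ff: int = 1024
    max_seq_len: int = 256
    consistency_weight: float = 0.5

    def head_dim(self) -> int:
        """Width of each attention head; d_model must divide evenly by heads."""
        if self.d_model % self.heads != 0:
            raise ValueError("d_model must be divisible by heads")
        return self.d_model // self.heads

    def total_steps(self, n_train_rows: int) -> int:
        """Optimizer steps over all epochs (assumes one step per batch)."""
        return math.ceil(n_train_rows / self.batch_size) * self.epochs


cfg = ExperimentConfig()
head_dim = cfg.head_dim()          # 256 / 8
total_steps = cfg.total_steps(8)   # 2 batches per epoch over 5 epochs
```

With 8 train rows and a batch size of 4, each variant sees only a handful of optimizer steps under these assumptions, which is worth keeping in mind when reading the results.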

## Results

| variant             | token_acc | claim_bin_acc | scalar_mse | mae_winprob | pearson_r_winprob | spearman_r_winprob | cfact_cls_follows_swap | cfact_cls_follows_orig | cfact_scalar_mse_to_swap |
| ------------------- | --------- | ------------- | ---------- | ----------- | ----------------- | ------------------ | ---------------------- | ---------------------- | ------------------------ |
| lm_only             | 0.253731  | 0.250000      | 0.018422   | 0.124634    | 0.747560          | 0.800000           | 0.250000               | 0.250000               | 0.012660                 |
| no_consistency_loss | 0.253731  | 0.000000      | 0.015259   | 0.092998    | 0.879609          | 1.000000           | 0.500000               | 0.000000               | 0.013909                 |
| rationale_only      | 0.156716  | 0.250000      | 0.025466   | 0.136320    | 0.887429          | 1.000000           | 0.250000               | 0.250000               | 0.011014                 |
| full_consistency    | 0.149254  | 0.250000      | 0.034114   | 0.129870    | 0.696828          | 0.800000           | 0.250000               | 0.250000               | 0.033793                 |
| random_consistency  | 0.089552  | 0.250000      | 0.019106   | 0.100707    | 0.822794          | 1.000000           | 0.750000               | 0.250000               | 0.008935                 |
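
The scalar columns can be reproduced with a few lines of standard-library Python. The sketch below assumes the metrics are computed over the 4 eval rows and that `spearman_r_winprob` is the Pearson correlation of average ranks; the function names and the example predictions are hypothetical and are not taken from the experiment.

```python
import math


def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def mae(pred, target):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def pearson_r(x, y):
    """Pearson correlation; returns NaN if either input has zero variance."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / math.sqrt(vx * vy)


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman correlation as the Pearson correlation of average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical 4-row example: the last two predictions are in the wrong order.
pred = [0.2, 0.4, 0.9, 0.7]
target = [0.1, 0.5, 0.6, 0.8]
rho = spearman_r(pred, target)
```

With four untied rows, Spearman's rho can only move in steps of 0.2, so a value of 0.8 corresponds to exactly one swap of adjacent ranks, as in the example above. Likewise, accuracy columns over 4 rows can only be multiples of 0.25.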

## Variant Notes

- `lm_only`: Pure LM baseline. It should model commentary tokens but is expected to underperform on calibrated oracle claims.
- `no_consistency_loss`: LM + scalar baseline. It can learn continuous win probability without directly tying rationale-pooled states to the bin claim.
- `rationale_only`: LM + rationale consistency only. This is the most direct test of whether rationale-pooled hidden states alone can support the inline claim.
- `full_consistency`: Full objective. If this variant wins on both bin accuracy and scalar quality, the shared hidden-state mechanism is working end to end.
- `random_consistency`: Control variant for the consistency term.
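
One way to read the variant notes is as a choice of which loss terms are switched on. The sketch below is an assumption, not the experiment's actual objective: it treats the "full objective" as LM + scalar + consistency, combines terms additively, gives the scalar term a weight of 1.0, and applies the setup's consistency weight of `0.5`. The names `VARIANT_TERMS` and `total_loss` are hypothetical. `random_consistency` is left out because the notes do not say how the control is constructed.

```python
# Assumed mapping from variant name to active loss terms (inferred from the notes).
VARIANT_TERMS = {
    "lm_only": {"lm"},
    "no_consistency_loss": {"lm", "scalar"},
    "rationale_only": {"lm", "consistency"},
    "full_consistency": {"lm", "scalar", "consistency"},
}


def total_loss(variant, lm_loss, scalar_loss, consistency_loss,
               consistency_weight=0.5):
    """Combine per-term losses for one variant (additive form is an assumption)."""
    terms = VARIANT_TERMS[variant]
    total = lm_loss  # every listed variant includes the LM term
    if "scalar" in terms:
        total += scalar_loss
    if "consistency" in terms:
        total += consistency_weight * consistency_loss
    return total


# Same hypothetical per-term losses, combined under each variant.
losses = {v: total_loss(v, 1.0, 0.2, 0.4) for v in VARIANT_TERMS}
```

Framing the variants this way makes the ablation structure explicit: `no_consistency_loss` and `rationale_only` each drop one auxiliary term from `full_consistency`, and `lm_only` drops both.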

## Interpretation Notes

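- The earlier synthetic generated-rationale scalar experiment showed perfect claim-bin accuracy and perfect orig-following counterfactual behavior for `rationale_only` and `full_consistency`. In this run, by contrast, no variant exceeds `0.25` on `claim_bin_acc` or `cfact_cls_follows_orig` over the 4 eval rows.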
- This KataGo version is a stronger domain test because the language model emits both commentary and an inline claim while only the claim is supervised by an oracle.
- FEVER-from-scratch was a weaker bridge task because it supervised evidence classification rather than oracle-graded generated claims.
