### Filenames
* `results_comparison.csv`
* `README.md`
* `claim_consistency_experiment.py`
* `claim_consistency_coupling_experiment_executed.ipynb`
* `results_comparison_hard.md`
* `results_comparison_hard.csv`
* `results_comparison_scaled.csv`
* `results_hidden_state_intervention.csv`
* `fever50k_full_20260428.csv`
* `fever_pretrained_gpt2_experiment.py`
* `fever50k_full_20260428.md`
* `manifest.json`

### Data Tables and Metrics

#### Synthetic Coupling Test (Easy/Non-overlapping)
| Metric | Consistency-Trained Variants | Baseline (No Consistency Loss) |
| :--- | :--- | :--- |
| Classifier Accuracy (Rationale Hidden States) | 100% | 3.9% |
| Generation Accuracy (`full_sequence`) | 93.8% | 75% |
| Generation Accuracy (`rationale_only`) | 63% | - |
| Generation Accuracy (`earlier_token_only`) | 62% | - |
| Counterfactual Swap-Following (Classifier) | 100% | 67% |
| Original State Sticking (Classifier) | 0% | 3% |
| Counterfactual Swap-Following (Generation, `full_sequence`) | 93.8% | - |
| Original State Sticking (Generation, `full_sequence`) | 1.6% | - |
| Shuffled-Pairing Control Accuracy | 8-10% | 8-10% |
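
The source does not define the swap-following and sticking metrics, so the following is only a minimal sketch under one plausible reading: after a rationale is swapped for one that encodes a different latent state, a prediction "follows the swap" if it equals the swapped-in state and "sticks" if it still equals the original state. The function name `swap_metrics` and the toy label lists are hypothetical and are not taken from `claim_consistency_experiment.py`.

```python
from typing import Sequence, Tuple


def swap_metrics(
    preds: Sequence[int],
    original: Sequence[int],
    swapped: Sequence[int],
) -> Tuple[float, float]:
    """Return (swap_following_rate, sticking_rate) for counterfactual examples.

    Assumption: each example has an original latent state and a different
    swapped-in state, and `preds` holds the model's prediction after the swap.
    """
    if not (len(preds) == len(original) == len(swapped)):
        raise ValueError("all inputs must have the same length")
    if len(preds) == 0:
        raise ValueError("at least one example is required")

    n = len(preds)
    # Prediction matches the state carried by the swapped-in rationale.
    followed = sum(p == s for p, s in zip(preds, swapped))
    # Prediction still matches the state the example started with.
    stuck = sum(p == o for p, o in zip(preds, original))
    return followed / n, stuck / n


# Toy data (hypothetical): four counterfactual examples.
follow_rate, stick_rate = swap_metrics(
    preds=[1, 2, 0, 3],
    original=[0, 1, 0, 2],
    swapped=[1, 2, 3, 3],
)
print(f"swap-following: {follow_rate:.2f}, sticking: {stick_rate:.2f}")
```

Under this reading the two rates need not sum to 100%, because a prediction can match neither the original nor the swapped-in state.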

#### Synthetic Experiment Parameters
| Parameter | Value |
| :--- | :--- |
| Training Samples | 512 |
| Evaluation Samples | 128 |
| Counterfactual Samples | 64 |
| Latent States | 8 |
| Training Epochs | 5 |
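
As a rough sanity check, these sample sizes fix the smallest step a reported percentage can take and the uniform chance level for an 8-way prediction. The sketch below assumes, without confirmation from the source, that accuracies are computed over the 128 evaluation samples and counterfactual metrics over the 64 counterfactual samples; the helper name `nearest_count` is hypothetical.

```python
N_EVAL = 128            # evaluation samples (from the table above)
N_COUNTERFACTUAL = 64   # counterfactual samples (from the table above)
N_LATENT_STATES = 8     # latent states (from the table above)


def nearest_count(percent: float, n_samples: int) -> int:
    """Return the whole number of samples closest to `percent` of `n_samples`."""
    return round(percent / 100 * n_samples)


# Uniform chance accuracy when guessing one of the latent states.
chance_accuracy = 1 / N_LATENT_STATES

# Smallest possible change in a percentage: one sample out of the set.
step_eval = 100 / N_EVAL
step_counterfactual = 100 / N_COUNTERFACTUAL

print(f"chance accuracy: {chance_accuracy:.1%}")
print(f"one eval sample = {step_eval:.2f} percentage points")
print(f"one counterfactual sample = {step_counterfactual:.2f} percentage points")
print("93.8% of 64 is closest to", nearest_count(93.8, N_COUNTERFACTUAL), "samples")
```

If the assumption holds, a percentage such as 1.6% on the counterfactual set is a single sample, so small differences between variants should be read with that granularity in mind.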

#### Hard Synthetic Experiment (50% Overlapping Vocabulary)
| Metric | Consistency-Trained Variants | Baseline (No Consistency Loss) |
| :--- | :--- | :--- |
| Classifier Accuracy | 100% | 4.7% |
| Counterfactual Swap-Following (Classifier) | 100% | 6.3% |
| Generation Accuracy (range across variants) | 81-100% | 100% |
| Generation Accuracy (`rationale_only`) | 100% | - |
| Generation Accuracy (`earlier_token_only`) | 100% | - |
| Generation Accuracy (`full_sequence`) | 81% | - |
| Counterfactual Swap-Following (Generation, `full_sequence`) | 77% | - |

#### FEVER Experiment (Pretrained GPT-2)
| Variant | Classifier Accuracy | Counterfactual Swap-Following | Generation Accuracy |
| :--- | :--- | :--- | :--- |
| `full_sequence_pooling` | 84% | 28-48% | - |
| `claim_only_pooling` | 83% | 28-48% | - |
| `evidence_only_pooling` | 44% | 48% | - |
| `no_consistency_loss` (Baseline) | 40% | - | 80% |

#### Causal Intervention Results
| Metric | Result |
| :--- | :--- |
| Hidden-State Intervention Success | 73-89% |
| `claim_only` Control Coupling | 43% |

### JSON Snippet
```json
{"status": "safe", "confidence": 0.92}
```
