# FEVER From-Scratch Transformer Claim-Consistency Results

## Setup

- Dataset: `copenlu/fever_gold_evidence`
- Train samples: 50,000 | Eval samples: 5,000
- Model: d_model=256, layers=4, heads=8, d_ff=1024
- Epochs: 10 | Batch size: 32 | LR: 0.0003
- Consistency loss weight: 0.5
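
For reference, the setup above can be written down as a small config object. This is a minimal sketch, not the project's actual code: the class name `RunConfig`, its field names, and the helper `total_loss` are hypothetical. It also assumes the consistency weight scales an additive term on top of the LM loss, and that the last partial batch of each epoch is kept.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RunConfig:
    # Values copied from the Setup list; field names are illustrative only.
    dataset: str = "copenlu/fever_gold_evidence"
    train_samples: int = 50_000
    eval_samples: int = 5_000
    d_model: int = 256
    n_layers: int = 4
    n_heads: int = 8
    d_ff: int = 1024
    epochs: int = 10
    batch_size: int = 32
    lr: float = 3e-4
    cons_weight: float = 0.5

    @property
    def head_dim(self) -> int:
        # Per-head width; d_model must split evenly across the heads.
        assert self.d_model % self.n_heads == 0
        return self.d_model // self.n_heads

    @property
    def steps_per_epoch(self) -> int:
        # Assumption: the final partial batch is kept (no drop_last).
        return math.ceil(self.train_samples / self.batch_size)


def total_loss(lm_loss: float, cons_loss: float, cfg: RunConfig) -> float:
    # Assumed combination: LM loss plus the weighted consistency term.
    return lm_loss + cfg.cons_weight * cons_loss


cfg = RunConfig()
print(cfg.head_dim, cfg.steps_per_epoch, cfg.steps_per_epoch * cfg.epochs)
```

Under these assumptions each attention head is 32 dimensions wide and a full run is 1,563 optimizer steps per epoch, or 15,630 in total.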

## Results

| variant | params | train_minutes | final_lm_loss | final_cons_loss | gen_claim_acc | cls_claim_acc | cfact_cls_follows_swap | cfact_cls_follows_orig | matched_cfact_cls_follows_swap | matched_cfact_cls_follows_orig | cfact_gen_follows_swap | cfact_gen_follows_orig | shuffled_cls_acc | shuffled_gen_acc |
|:---|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| no_consistency_loss | 16093699 | 9.8683 | 0.1218 | 1.0443 | 0.5180 | 0.3080 | 0.3820 | 0.2640 | 0.3560 | 0.2580 | 0.3140 | 0.4400 | 0.2700 | 0.4920 |
| evidence_only_pooling | 16093699 | 10.6377 | 0.1530 | 0.6220 | 0.5220 | 0.4356 | 0.3920 | 0.3080 | 0.4220 | 0.3100 | 0.3280 | 0.4020 | 0.3440 | 0.4880 |
| full_sequence_pooling | 16093699 | 11.4225 | 0.1353 | 0.0038 | 0.5360 | 0.9900 | 0.9720 | 0.0180 | 0.9820 | 0.0140 | 0.3180 | 0.4440 | 0.9760 | 0.4920 |
| claim_only_pooling | 16093699 | 10.6541 | 0.1571 | 0.1695 | 0.5320 | 0.5722 | 0.3300 | 0.4500 | 0.3840 | 0.4000 | 0.3500 | 0.4160 | 0.5080 | 0.5300 |
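
One way to read the counterfactual columns is as a per-variant gap between `cfact_cls_follows_swap` and `cfact_cls_follows_orig`. The sketch below computes that gap from the table values. The `neither` figure is an assumption: it treats both columns as fractions of the same counterfactual eval set, with a third outcome in which the prediction matches neither label. The names `CFACT_CLS` and `summarize` are illustrative.

```python
# (follows_swap, follows_orig) for the classifier head, copied from the table.
CFACT_CLS = {
    "no_consistency_loss": (0.3820, 0.2640),
    "evidence_only_pooling": (0.3920, 0.3080),
    "full_sequence_pooling": (0.9720, 0.0180),
    "claim_only_pooling": (0.3300, 0.4500),
}


def summarize(rows):
    """Return the swap-minus-orig gap and the leftover fraction per variant."""
    out = {}
    for variant, (swap, orig) in rows.items():
        out[variant] = {
            # Positive gap: predictions track the swapped evidence more often.
            "gap": round(swap - orig, 4),
            # Assumed residual: predictions matching neither label.
            "neither": round(1.0 - swap - orig, 4),
        }
    return out


summary = summarize(CFACT_CLS)
for variant, stats in summary.items():
    print(f"{variant:24s} gap={stats['gap']:+.4f} neither={stats['neither']:.4f}")
```

On these numbers `full_sequence_pooling` has a gap of +0.954, while `claim_only_pooling` is the only variant with a negative gap (-0.120).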