# Failure Modes

This file summarizes the failure categories observed during the logged May
2026 workflow window. The anonymized ledger supports aggregate PR counts, but
its provenance is too coarse to support exact Claude-vs-Codex performance
comparisons.

## Summary Table

| Failure mode | Typical symptom | Primary detector | Mitigation |
|---|---|---|---|
| Fabricated import or file path | `lake build` fails before theorem checking | Gate 1 build | Add compositional source pointer and require local build before PR |
| Stale base | PR compiles in old worktree but fails after trunk changes | Gate 1 build | Rebase worktree from release-candidate before opening PR |
| Parallel cache race | transient `.lake` or cache failure under concurrent builds | Gate 1 build or local retry | Isolate worktrees; retry cache step; avoid sharing mutable build directories |
| Literal proof escape hatch | `sorry` or `admit` appears in source | Gate 3 regex plus `warningAsError` | Reject PR; require closed proof term |
| Custom axiom or constant | source introduces `axiom` or `constant` | Gate 3 regex | Reject PR; require theorem from Mathlib/FormalSLT primitives |
| Noncanonical axiom trace | showcase theorem depends on an axiom outside the allowed set | Gate 4 axiom audit | Reject PR; replace proof route or weaken unsupported dependency |
| Statement drift | Lean theorem is stronger/weaker/different from the informal target | Human review | Revise target signature; document non-claims in theorem manifest |
| Overbroad edit | PR changes multiple theorem families or refactors unrelated files | Human review | Split into theorem-family lane PRs |
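
As a small illustration of the "Literal proof escape hatch" row, the sketch below shows why a build that merely succeeds is not enough. The declaration name `demo_open` is hypothetical and is not part of FormalSLT. Without `warningAsError`, Lean accepts the file and only warns; with it enabled, as the table describes, the warning is expected to fail the build.

```lean
-- Hypothetical declaration for illustration; not taken from FormalSLT.
-- The proof is left open, so Lean accepts the file but emits a
-- "declaration uses 'sorry'" warning.
theorem demo_open (n : Nat) : n + 0 = n := by
  sorry

-- The axiom report lists `sorryAx`, which is outside the allowed set.
#print axioms demo_open
```

The regex check and the axiom audit overlap here on purpose: the regex catches the literal token in source, while the axiom report would still expose a `sorry` reached indirectly through another declaration.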

## Why Gate 4 Matters

Lean compilation alone proves only that a term inhabits the stated type in the
current environment, including whatever axioms that environment declares. The
showcase axiom audit adds a stronger review surface: for each public showcase
declaration, the CI records the axiom set reported by `#print axioms` and
fails if any reported axiom falls outside the allowed set:

```text
propext
Classical.choice
Quot.sound
```

This is not a proof of all Mathlib soundness assumptions. It is a mechanical
audit that the showcased FormalSLT theorems depend on no axioms beyond the
three listed above, which are the standard axioms Mathlib itself relies on.
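
A minimal sketch of what the audit inspects, using core Lean 4 only. All declaration names below (`demo_refl`, `demo_em`, `demo_shortcut`, `demo_bad`) are hypothetical and chosen for illustration. How the CI collects and compares the reports is not shown here.

```lean
-- Hypothetical declarations for illustration; not taken from FormalSLT.

-- Proved by definitional unfolding, so it depends on no axioms.
theorem demo_refl (n : Nat) : n + 0 = n := rfl

-- Classical reasoning: depends only on axioms from the allowed set.
theorem demo_em (p : Prop) : p ∨ ¬p := Classical.em p

-- A custom (and here inconsistent) axiom of the kind the gates reject.
axiom demo_shortcut : ∀ n : Nat, n = 0

-- This typechecks, which is exactly why compilation alone is not enough.
theorem demo_bad : (1 : Nat) = 0 := demo_shortcut 1

#print axioms demo_refl  -- reports no axioms
#print axioms demo_em    -- reports propext, Classical.choice, Quot.sound
#print axioms demo_bad   -- reports demo_shortcut, outside the allowed set
```

Note that `demo_bad` compiles cleanly. Only the axiom report reveals that its proof rests on `demo_shortcut`.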

## Statement Drift Is Not A Kernel Error

The Lean kernel checks proof correctness, not theorem intent. A theorem can
typecheck while proving a claim that is too weak, too strong, or scoped
differently from the informal target. That is why the workflow keeps a human
statement-adequacy review and ships `THEOREM_MANIFEST.md`.
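
The following sketch shows three statements that all typecheck against the same hypothetical informal target, "addition on natural numbers is commutative". Only the first matches it. The names are illustrative and do not come from FormalSLT.

```lean
-- Hypothetical informal target: "addition on natural numbers is commutative".

-- Faithful statement: quantifies over both summands.
theorem demo_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- Too weak: only covers the case where the second summand is zero.
theorem demo_add_comm_weak (a : Nat) : a + 0 = 0 + a :=
  Nat.add_comm a 0

-- Vacuous: the hypothesis `a < 0` can never hold for a natural number,
-- so the (false-looking) conclusion is never actually asserted.
theorem demo_add_comm_vacuous (a b : Nat) (h : a < 0) :
    a + b = b + a + 1 :=
  absurd h (Nat.not_lt_zero a)
```

The kernel accepts all three, and none of them introduces an axiom, so Gates 1 through 4 would pass. Only a reviewer comparing each signature with the informal target can tell them apart.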

## Provenance Caution

The anonymized ledger preserves complete aggregate workflow counts but only
partial tool-specific provenance. It should be used for aggregate throughput,
family-level breakdowns, and failure-mode discussion. It should not be used to
claim that Claude or Codex outperformed the other.
