# Prompts

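This folder mirrors every prompt currently used in the project, extracted for
easy review and diffing. Some files are loaded at runtime; the rest are
read-only mirrors of prompts defined in Python source.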

**Inference prompts** (`kimina_inference_*.txt`) are loaded at runtime by
[`cluster/run_gemini_inference.py`](../cluster/run_gemini_inference.py) — change
them and the next inference run picks up the new content.
[`gpu_inference_Kimina-Prover-RL-1-7B.py`](../ProofBridge_PaperOriginal_Imported/llm_inference/gpu_inference_Kimina-Prover-RL-1-7B.py)
still has them hardcoded; mirror any change here back into that source until
it's also refactored to load from this folder.
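
As a rough illustration of the runtime-loading pattern described above, a loader could look like the sketch below. The helper names `load_prompt` and `build_user_prompt` are hypothetical and are not taken from `run_gemini_inference.py`, whose actual code may differ. Only the three placeholder names come from the index in this README.

```python
from pathlib import Path


def load_prompt(prompt_dir: Path, name: str) -> str:
    """Read one prompt file as UTF-8 so Lean Unicode survives intact."""
    return (prompt_dir / name).read_text(encoding="utf-8")


def build_user_prompt(template: str, informal_statement: str,
                      informal_proof: str, example_block: str = "") -> str:
    """Fill the placeholders documented for kimina_inference_user.txt.

    example_block defaults to the empty string, which is an assumption
    about how a run without examples would be rendered.
    """
    return template.format(
        example_block=example_block,
        informal_statement=informal_statement,
        informal_proof=informal_proof,
    )
```

Because the file is read on every call, an edit to the `.txt` file takes effect on the next run without touching Python source, which is the behaviour this section describes.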

**SC prompts** (`sc_v3_*.txt`) are read-only mirrors; the canonical copies live
in `sc_combined_v3.py` and are NOT loaded from these files.

**Edit-judge prompts** (`edit_judge_*_v2ctx.txt`) are read-only mirrors of the
FR/RR/OUR judges for Exp2/3/4; the canonical copies live in
[`number_edit/score_number_edit_region.py`](../number_edit/score_number_edit_region.py),
[`symbol_edit/score_symbol_edit_region.py`](../symbol_edit/score_symbol_edit_region.py),
and [`step_edit/score_step_delete.py`](../step_edit/score_step_delete.py). All
three share the same evidence-grounded design: an anti-hallucination
instruction plus a `context_in_fl` / `evidence_in_fl` field that requires the
judge to copy the literal Lean snippet supporting its verdict.
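
Read-only mirrors can drift from their canonical Python constants. A minimal sketch of a drift check follows, assuming the canonical prompt is available as a plain string (for example, a `SCORE_PROMPT` imported from one of the scoring modules). The helper name `mirror_diff` is hypothetical; no such check is known to exist in the repo.

```python
import difflib


def mirror_diff(canonical: str, mirror: str, name: str = "prompt") -> list[str]:
    """Return unified-diff lines between a canonical prompt and its mirror.

    An empty list means the mirror is in sync. Trailing newlines are ignored
    because text files usually end with one while string constants may not.
    """
    a = canonical.rstrip("\n").splitlines()
    b = mirror.rstrip("\n").splitlines()
    return list(difflib.unified_diff(
        a, b,
        fromfile=f"{name} (canonical)",
        tofile=f"{name} (mirror)",
        lineterm="",
    ))
```

A caller would pass the imported constant and the text of the matching `.txt` file, then fail a review step if the returned list is non-empty.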

Per memory `feedback_unified_inference_prompt`: **all models share the same
inference prompt content**; only the chat template adapts per model.

## Index

| File | Used by | Source location | When fired |
|---|---|---|---|
| `kimina_inference_system.txt` | All inference (Kimina vLLM, Gemini API, future models) | Loaded at runtime by [`cluster/run_gemini_inference.py`](../cluster/run_gemini_inference.py); also hardcoded at [`gpu_inference_Kimina-Prover-RL-1-7B.py:377`](../ProofBridge_PaperOriginal_Imported/llm_inference/gpu_inference_Kimina-Prover-RL-1-7B.py#L377) | system role / `system_instruction` |
| `kimina_inference_user.txt` | All inference | Loaded by `run_gemini_inference.py`; hardcoded in [`wrap_prompt_in_query`](../ProofBridge_PaperOriginal_Imported/llm_inference/gpu_inference_Kimina-Prover-RL-1-7B.py#L291) | user role; placeholders: `{example_block}`, `{informal_statement}`, `{informal_proof}` |
| `kimina_inference_examples.txt` | Few-shot (FS) mode only (1.7B base FS, the only retained FS run) | `--fs` flag in `run_gemini_inference.py`; `HARDCODED_EXAMPLES` in [`gpu_inference_Kimina-Prover-RL-1-7B.py:176`](../ProofBridge_PaperOriginal_Imported/llm_inference/gpu_inference_Kimina-Prover-RL-1-7B.py#L176) | Substituted into `{example_block}` |
| `sc_v3_stmt.txt` | SC v3 — StmtSC judge (Call 1) | [`sc_combined_v3.py:CALL_STMT_V3`](../ProofBridge_PaperOriginal_Imported/llm_inference/sc_combined_v3.py) | Per (problem × sample) during TC+SC eval |
| `sc_v3_proof.txt` | SC v3 — ProofSC judge (Call 2) | [`sc_combined_v3.py:CALL_PROOF_V3`](../ProofBridge_PaperOriginal_Imported/llm_inference/sc_combined_v3.py) | Per (problem × sample) — skipped when proof body is degenerate |
| `edit_judge_number_v2ctx.txt` | Exp2 number-edit FR/RR/OUR judge (region-restricted + evidence) | [`score_number_edit_region.py:SCORE_PROMPT`](../number_edit/score_number_edit_region.py) | Per (model × dataset × case); placeholders: `{source}`, `{edit_type}`, `{old_value}`, `{new_value}`, `{context}`, `{informal_statement}`, `{informal_proof}`, `{generated_fl}` |
| `edit_judge_symbol_v2ctx.txt` | Exp3 symbol-edit FR/RR/OUR judge | [`score_symbol_edit_region.py:SCORE_PROMPT`](../symbol_edit/score_symbol_edit_region.py) | Per case; placeholders: same shape as number, plus `{old_symbol}`, `{new_symbol}`, `{family}` |
| `edit_judge_step_v2ctx.txt` | Exp4 step-delete FR/RR/OUR judge | [`score_step_delete.py:SCORE_PROMPT`](../step_edit/score_step_delete.py) | Per case; placeholders: `{reasoning_text}`, `{outcome_text}`, `{is_last}`, `{informal_proof}`, `{generated_fl}` |
| `variant_rephrase.txt` | Exp1 NL rephrase generation | Loaded by [`generate_variants.py`](../datasets_validation/generate_variants.py) | Offline, once per (dataset × `--style rephrase`); placeholders: `%STATEMENT%`, `%PROOF%`, `{faithfulness_rules}` |
| `variant_step.txt` | Exp1 NL step-by-step rewrite | Loaded by `generate_variants.py` | Offline, `--style step`; same placeholders as `variant_rephrase.txt` |
| `variant_faithfulness_rules.txt` | Shared rules block | Injected into both `variant_rephrase.txt` and `variant_step.txt` via `{faithfulness_rules}` | — |
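
The index lists two kinds of placeholder for the variant prompts (`{faithfulness_rules}` and the `%STATEMENT%` / `%PROOF%` tokens) but does not say how they are substituted. The sketch below shows one plausible way to fill all three in a single pass. It is an assumption for illustration, not necessarily what `generate_variants.py` does, and the function name `render_variant_prompt` is hypothetical.

```python
import re


def render_variant_prompt(template: str, faithfulness_rules: str,
                          statement: str, proof: str) -> str:
    """Fill the three documented variant placeholders in one pass.

    A single regex pass means inserted text is never rescanned, so a
    statement that happens to contain "%PROOF%" or LaTeX braces is left
    untouched.
    """
    values = {
        "{faithfulness_rules}": faithfulness_rules,
        "%STATEMENT%": statement,
        "%PROOF%": proof,
    }
    pattern = re.compile("|".join(re.escape(key) for key in values))
    # A function replacement avoids backslash-escape processing in the values.
    return pattern.sub(lambda m: values[m.group(0)], template)
```

Plain token replacement sidesteps the need to escape literal braces in the template, which matters when the informal statement or proof contains LaTeX such as `\frac{a}{b}`.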

## Conventions

- Filenames lowercase with `_` separators.
- All prompt files use UTF-8. Lean Unicode (∑, ≤, ↔, ⟨⟩, etc.) is preserved
  exactly as the source uses it.
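- `{name}` placeholders match the Python `.format()` keys exactly, so a reader
  can match prompt → caller without hunting. The `%STATEMENT%` / `%PROOF%`
  tokens in the variant prompts are the exception: they are not `.format()`
  keys.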
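
The placeholder convention can be checked mechanically. The sketch below uses the standard-library `string.Formatter` to list the named fields in a format-style template and compare them with the keys a caller supplies. The helper names `placeholder_names` and `missing_keys` are hypothetical.

```python
from string import Formatter


def placeholder_names(template: str) -> set[str]:
    """Collect the named {field} placeholders in a format-style template."""
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion);
    # field_name is None for trailing literal text and "" for a bare "{}".
    return {field for _, field, _, _ in Formatter().parse(template) if field}


def missing_keys(template: str, provided: set[str]) -> set[str]:
    """Return placeholders in the template that the caller does not supply."""
    return placeholder_names(template) - provided
```

Literal braces in a format-style template must be doubled (`{{` and `}}`); otherwise they are parsed as fields, so a prompt that embeds Lean binders such as `{x : ℕ}` would need that escaping for this check, and for `.format()` itself, to behave.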

## Not extracted (legacy / unused)

- `evaluate.py:prompt_LLM`: old bidirectional-equivalence judge prompt, not
  imported by anything in the current pipeline. If it is revived, mirror it
  as `legacy_bidir_equiv.txt`.
