# Run Experiment Artifacts

This folder collects the experiment code, data, and result files found in the workspace and in assets downloaded from past sessions. Files are grouped by experiment family. Within each family, runnable source files are under `code/`, datasets under `data/`, metrics and results under `results/`, and reports or design notes under `docs/`.
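The family-then-role convention described above can be checked with a short script. The following is a minimal sketch, not part of the original package: the function name `inventory` and the catch-all role name `"other"` are hypothetical, and the script assumes it is pointed at the package root.

```python
from collections import defaultdict
from pathlib import Path

# The four subfolder roles named in this README.
ROLES = ("code", "data", "results", "docs")


def inventory(root):
    """Map family -> role -> sorted list of file paths relative to the family.

    Files that do not sit under one of the four role subfolders (for
    example the manifests in recovered_from_threads/) are filed under
    the hypothetical role name "other".
    """
    root = Path(root)
    tree = defaultdict(lambda: defaultdict(list))
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        parts = path.relative_to(root).parts
        if len(parts) < 2:
            continue  # top-level files such as this README belong to no family
        family = parts[0]
        # A role only counts when the file is inside that subfolder.
        role = parts[1] if len(parts) > 2 and parts[1] in ROLES else "other"
        tree[family][role].append("/".join(parts[1:]))
    return {family: dict(roles) for family, roles in tree.items()}
```

Calling `inventory(".")` from the package root should return one entry per experiment family, which makes it easy to spot a family that has `results/` and `docs/` but no `code/`.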

## Folder layout

### `synthetic_claim_consistency/`

- `code/claim_consistency_experiment.py` — minimal synthetic claim-consistency experiment.
- `code/run_hard_experiment.py` — harder overlapping-vocabulary synthetic experiment.
- `results/results_comparison_hard.csv` — hard-run metrics.
- `results/results_comparison_hard.md` — hard-run result summary.
- `results/README.md` — original experiment README available in the workspace.

### `leancheck/`

- `code/leancheck_data.py` — LeanCheck data generation/loading utilities.
- `code/leancheck_model.py` — LeanCheck model code.
- `code/leancheck_experiment.py` — LeanCheck training/evaluation experiment.
- `code/leancheck_eval.py` — LeanCheck evaluation script.
- `data/leancheck_train.jsonl` — training set.
- `data/leancheck_eval.jsonl` — evaluation set.
- `data/leancheck_counterfactual.jsonl` — counterfactual swap set.
- `data/leancheck_minimal_pairs.jsonl` — minimal-pair set.
- `results/results_leancheck_gpt2_lean_1k_patch_head_p100.csv` — LeanCheck run metrics and patching results.
- `results/results_leancheck_gpt2_lean_1k_patch_head_p100.md` — LeanCheck result summary.
- `results/README_LEANCHECK.md` — original LeanCheck README.
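
The four `data/*.jsonl` files can be inspected before running any LeanCheck script. This README does not document their record schema, so the sketch below assumes only that each non-empty line holds one JSON value and reports whatever top-level keys it finds. The helper names `read_jsonl` and `summarize` are hypothetical and do not come from `leancheck_data.py`.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Yield one parsed record per non-empty line of a JSON Lines file."""
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                yield json.loads(line)
            except json.JSONDecodeError as exc:
                # Report the file and line so a damaged record is easy to find.
                raise ValueError(
                    f"{path}:{line_number}: invalid JSON ({exc.msg})"
                ) from exc


def summarize(path):
    """Return (record count, sorted union of top-level keys)."""
    count, keys = 0, set()
    for record in read_jsonl(path):
        count += 1
        if isinstance(record, dict):
            keys.update(record)
    return count, sorted(keys)
```

For example, `summarize("leancheck/data/leancheck_train.jsonl")` returns the number of training records and the field names they use, which is a quick way to confirm that the training and evaluation sets share a schema.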

### `code_coupling/`

The available code-coupling assets in this workspace are result and review artifacts recovered from past-session outputs, not the runnable experiment source.

- `results/metrics.csv` — main code-coupling metrics.
- `results/manual_review_scores.csv` — manual qualitative review scores.
- `results/qualitative_side_by_side.csv` — side-by-side qualitative comparison table.
- `results/full_run_manual_review_generated_explanations.csv` — generated explanations used for review.
- `docs/report.md` — code-coupling report.
- `docs/manual_qualitative_review.md` — qualitative review writeup.
- `docs/qualitative_side_by_side.md` — side-by-side qualitative writeup.
- `docs/full-run-manual-review-generated-explanations.md` — generated-explanation review artifact.
- `docs/code_coupling_experiment_report.md` — additional workspace report.

### `katago_go/`

The available KataGo/Go artifacts in this workspace are architecture notes, extracted results, and project memories, not the runnable training source or raw Go datasets.

- `docs/KataGo--LLM-Architecture.md` — KataGo-to-LLM architecture notes.
- `docs/KataGo-NL-Transformer-Build-Guide.md` — build guide and architecture details.
- `docs/katago_llm.md` — project memory summary for KataGo LLM work.
- `docs/deep_variations.md` — project memory on KataGo deep-variation work.

### `fever/`

The original full FEVER Python files and raw CSV exports were mentioned in the linked threads but could not be downloaded through the available browser session. This package instead includes the command lines and the evidence-masking code snippet that were visible in the threads, plus a tightened-rerun CSV reconstructed from the exact values available in the threads and project inventory.

- `docs/fever_experiment.md` — FEVER experiment memory and result summary.
- `code/recovered_fever_commands.sh` — visible FEVER run commands recovered from the threads.
- `code/recovered_gradient_blocking_snippet.py` — visible evidence-only pooling snippet recovered from the tightened-run thread.
- `results/fever50k_rerun_20260428.reconstructed.csv` — reconstructed variant metrics from thread-visible/project-inventory values.
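
Because the reconstructed CSV was rebuilt from thread text rather than exported by the experiment, it is worth reading it without assuming column names. The sketch below is one way to do that. The helper names `as_number` and `read_table` are hypothetical, and the only assumption about the file is that its first row is a header.

```python
import csv
from pathlib import Path


def as_number(cell):
    """Convert a cell to float when possible, otherwise return it unchanged."""
    try:
        return float(cell)
    except (TypeError, ValueError):
        return cell


def read_table(path):
    """Return (header, rows), with numeric-looking cells converted to float."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        rows = [
            {key: as_number(value) for key, value in row.items()}
            for row in reader
        ]
        # fieldnames is None for an empty file, so fall back to an empty list.
        header = list(reader.fieldnames or [])
    return header, rows
```

Calling `read_table("fever/results/fever50k_rerun_20260428.reconstructed.csv")` then gives the header exactly as reconstructed, so any comparison against `docs/fever_experiment.md` uses the column names that are actually in the file.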

### `project_inventory/`

- `docs/go-space-experiment-inventory.pplx.md` — comprehensive inventory of every experiment/result/ablation compiled from the project.
- `docs/perplexity_experiment_extraction.md` — browser extraction from the linked Perplexity experiment threads.
- `docs/perplexity_go_space_inventory_2026-05-17.md` — earlier Space inventory artifact.

### `recovered_from_threads/`

This folder contains code fences, tables, mentioned filenames, and raw fetch outputs recovered directly from the linked Perplexity pages. These are useful for provenance and for reconstructing partial files that were visible in thread text but not downloadable as full artifacts.

- `RECOVERY_MANIFEST.md` — per-thread recovery summary from content fetch.
- `BROWSER_RECOVERY_REPORT.md` — browser recovery summary for artifact-heavy threads.
- `raw_fetches/` — raw JSON outputs from `pplx content fetch` for the supplied links.
- `code_coupling_design/` — visible code-coupling snippets, including oracle, dataset structure, tokenizer extension, model-change snippets, commands, and adversarial-pair snippet.
- `fever_tightened_run/` — visible FEVER tightened-run snippet, Modal command, tables, and mentioned files.
- `codex_go_rl_pilot/` — visible Codex pilot snippets and schema/tool/reward fragments.
- Other subfolders contain extracted content and tables for the architecture/design threads.

## Notes

- This package includes all experiment code and data files found in the current workspace plus downloaded past-session assets.
- Some experiments described in the paper draft were run in earlier threads, but their full runnable source or raw datasets could not be downloaded from the visible thread pages. These are represented by result summaries, visible snippets, raw thread extractions, and, where exact numeric values were recoverable, reconstructed CSVs.
- The most complete runnable packages currently available are `synthetic_claim_consistency/` and `leancheck/`.
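
Before relying on the two runnable packages, it can help to confirm that every file listed for them in this README is present. The following is a minimal sketch: the file lists are copied from the sections above, while the names `EXPECTED` and `missing_files` are hypothetical.

```python
from pathlib import Path

# Files listed in this README for the two runnable families.
EXPECTED = {
    "synthetic_claim_consistency": [
        "code/claim_consistency_experiment.py",
        "code/run_hard_experiment.py",
        "results/results_comparison_hard.csv",
        "results/results_comparison_hard.md",
        "results/README.md",
    ],
    "leancheck": [
        "code/leancheck_data.py",
        "code/leancheck_model.py",
        "code/leancheck_experiment.py",
        "code/leancheck_eval.py",
        "data/leancheck_train.jsonl",
        "data/leancheck_eval.jsonl",
        "data/leancheck_counterfactual.jsonl",
        "data/leancheck_minimal_pairs.jsonl",
        "results/results_leancheck_gpt2_lean_1k_patch_head_p100.csv",
        "results/results_leancheck_gpt2_lean_1k_patch_head_p100.md",
        "results/README_LEANCHECK.md",
    ],
}


def missing_files(root, expected=EXPECTED):
    """Return the listed paths that are not present under root."""
    root = Path(root)
    return [
        f"{family}/{relative}"
        for family, relatives in expected.items()
        for relative in relatives
        if not (root / family / relative).is_file()
    ]
```

An empty list from `missing_files(".")`, run from the package root, means both packages match this README. Any returned path points to a file that was listed here but did not make it into the copy at hand.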
