# M2F Supplementary Material for AI4Math 2026

This archive contains the anonymous supplementary material for the workshop submission
`M2F: Automated Formalization of Mathematical Literature at Scale`.

## Contents

- `M2F_results/`: the released, generated Lean workspace. It contains the buildable Lean projects, the pinned Lean toolchain file, `lakefile.lean`, and `lake-manifest.json`.
- `mathdoc-parser-main/`: the document-to-item parser that runs before statement compilation. It converts source documents into structured, JSON-style items that the statement pipeline consumes.
- `codex_agent_bookformalization/`: anonymized orchestration code, prompt templates, agent configurations, verifier wrappers, and pipeline scripts for statement compilation and proof repair.
- `BUILD_REPORT.md`: pinned-environment and artifact-status summary.

## Scope

The generated Lean workspace is intended for artifact inspection and build-status verification under the pinned Lean/`mathlib` environment. Re-running the full raw-source-to-Lean pipeline additionally requires the external source documents, model credentials, and access to model services comparable to those used for the submission. None of these inputs is included in this anonymous supplementary archive.
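
The build-status check described above could be scripted roughly as follows. This is a minimal sketch, not part of the archive: the helper names `find_lake_projects` and `build_all` are hypothetical, and it assumes that each buildable project is a directory under `M2F_results/` containing a `lakefile.lean` and that `lake` is already installed and on `PATH`.

```python
import shutil
import subprocess
from pathlib import Path


def find_lake_projects(root):
    """Return directories under `root` that contain a `lakefile.lean`.

    Anything inside a `.lake` folder (downloaded dependencies) is skipped.
    """
    root = Path(root)
    projects = []
    for lakefile in root.rglob("lakefile.lean"):
        if ".lake" in lakefile.relative_to(root).parts:
            continue
        projects.append(lakefile.parent)
    return sorted(projects)


def build_all(root="M2F_results"):
    """Run `lake build` in every project found and map each path to its exit code."""
    if shutil.which("lake") is None:
        raise RuntimeError("`lake` was not found on PATH")
    results = {}
    for project in find_lake_projects(root):
        # Assumption: a plain `lake build` is the intended build entry point.
        completed = subprocess.run(["lake", "build"], cwd=project)
        results[str(project)] = completed.returncode
    return results
```

Calling `build_all()` from the archive root returns a dictionary in which an exit code of `0` marks a project that built successfully. For projects that depend on `mathlib`, running `lake exe cache get` beforehand typically avoids compiling `mathlib` from source.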

## Size Summary

- Total archive folder: about 20 MB.
- Generated Lean workspace: about 19 MB.
- Parser code: about 308 KB.
- Orchestration code and prompts: about 1.3 MB.
- Files: 718 total, including 591 Lean source files in `M2F_results/`.
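
The figures above can be cross-checked against an unpacked copy of the archive with a short script. The sketch below uses a hypothetical helper, `summarize`, and makes two assumptions: every regular file counts toward the total, and a Lean source file is any file with the `.lean` suffix (which would include `lakefile.lean`). The exact counting convention behind the reported numbers is not stated here, so small discrepancies are possible.

```python
from pathlib import Path


def summarize(root):
    """Count files, `.lean` files, and total bytes under `root`."""
    root = Path(root)
    files = [p for p in root.rglob("*") if p.is_file()]
    lean_files = [p for p in files if p.suffix == ".lean"]
    total_bytes = sum(p.stat().st_size for p in files)
    return {
        "files": len(files),
        "lean_files": len(lean_files),
        "bytes": total_bytes,
    }
```

For example, `summarize("M2F_results")["lean_files"]` gives the number of Lean files in the generated workspace, and `summarize(".")["bytes"] / 1e6` gives the archive size in decimal megabytes.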

## Anonymization Notes

The archive excludes repository metadata, local build products, `.lake` folders, Python bytecode caches, raw logs, local source PDFs/LaTeX files, credentials, and user-specific paths. The generated Lean workspace and code are provided as anonymous artifacts for review.
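
A reviewer who wants to spot-check these exclusions could scan the unpacked archive for leftover items. The sketch below is illustrative only: the helper `find_leftovers` and the specific directory names and file suffixes are assumptions chosen to approximate the categories listed above, not the rules actually used to prepare the archive. It does not detect credentials or user-specific paths embedded in file contents.

```python
from pathlib import Path

# Assumed stand-ins for "repository metadata", `.lake` folders, and bytecode caches.
EXCLUDED_DIRS = {".git", ".lake", "__pycache__"}
# Assumed stand-ins for bytecode files, source PDFs/LaTeX files, and raw logs.
EXCLUDED_SUFFIXES = {".pyc", ".pdf", ".tex", ".log"}


def find_leftovers(root):
    """Return sorted relative paths of directories or files that match the lists above."""
    root = Path(root)
    hits = []
    for path in root.rglob("*"):
        rel = path.relative_to(root).as_posix()
        if path.is_dir() and path.name in EXCLUDED_DIRS:
            hits.append(rel)
        elif path.is_file() and path.suffix.lower() in EXCLUDED_SUFFIXES:
            hits.append(rel)
    return sorted(hits)
```

An empty list from `find_leftovers(".")` is consistent with the exclusions described above, under these assumed patterns.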
