# Third-party assets and licenses

This repository combines our own code (to be released under the MIT
license with the camera-ready version; see `LICENSE` after
de-anonymization) with several third-party assets. The tables below list
each asset, its license, and how we use it.

## Models

| Asset | License | Use |
|---|---|---|
| OLMo-3-7B-Instruct-SFT (`allenai/OLMo-3-7B-Instruct`) | Apache 2.0 | Base model for SFT and RL stages |
| DSR teacher (used to distil SFT traces; cited in paper §3) | per the model card | Trace distillation only; teacher model not redistributed here |
| `distilbert-base-uncased` (HF) | Apache 2.0 | Backbone for the v90 9-class span classifier |

## Frameworks and libraries

| Library | License | Use |
|---|---|---|
| VERL 0.7.0 (`volcengine/verl`)                                  | Apache 2.0 | RL training harness |
| vLLM 0.12.0 (`vllm-project/vllm`)                               | Apache 2.0 | Rollout / inference engine |
| PyTorch 2.9.0                                                   | BSD-3      | Tensor / autograd |
| transformers 4.56.1                                             | Apache 2.0 | Model loading |
| peft 0.17.1                                                     | Apache 2.0 | LoRA |
| flash-attn 2.8.3                                                | BSD-3      | Attention kernels |
| flashinfer 0.6.2                                                | Apache 2.0 | SM90a / SM100a attention kernels |
| Ray 2.49.1                                                      | Apache 2.0 | Distributed orchestration |
| wandb 0.24.0                                                    | MIT        | Optional metrics logging |
| `lm-evaluation-harness` (EleutherAI)                            | MIT        | Eval harness; we ship custom task YAMLs under `evaluate/custom_tasks/` |
| `math_verify` (Hugging Face)                                    | Apache 2.0 | Math answer verification in `compute_pass_at_k.py` |
| Hydra / OmegaConf                                               | MIT / BSD-3 | Config management |
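
To confirm that a local environment matches the versions pinned above, a check along the following lines may help. It is a minimal sketch, not part of the repository: the distribution names (`torch` for PyTorch, `flash-attn`, and so on) are assumptions about how each package is published, and `find_mismatches` is a hypothetical helper. Libraries whose distribution name or version is not stated in the table are left out.

```python
from importlib import metadata

# Versions copied from the table above. The keys are ASSUMED distribution
# names; adjust them if a package is installed under a different name.
PINNED = {
    "verl": "0.7.0",
    "vllm": "0.12.0",
    "torch": "2.9.0",
    "transformers": "4.56.1",
    "peft": "0.17.1",
    "flash-attn": "2.8.3",
    "ray": "2.49.1",
    "wandb": "0.24.0",
}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Map each mismatching package to an (expected, found) pair."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        # Treat local build tags such as "2.9.0+cu128" as matching "2.9.0".
        base = found.split("+", 1)[0] if found else None
        if base != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

The `lookup` parameter exists only so the comparison can be exercised without installing the packages; in normal use the default reads the active environment.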

## Datasets and benchmarks

| Asset | License | Use |
|---|---|---|
| OlymMATH (Easy / Hard)       | per the original release (`Hothan/OlympiadBench`) | Out-of-domain math eval; primary metric in paper |
| AIME 2024 (`Maxwell-Jia/AIME_2024`)                              | per source | OOD math eval |
| AIME 2025 (`yentinglin/aime_2025`)                               | per source | OOD math eval |
| HMMT problems                                                    | per source (problems are public) | OOD math eval; we mirror the problems, with our scoring template, at `anon-neurips26/hmmt_combined` |
| OMEGA Explorative (`OMEGA-Bench/...`)                            | per OMEGA release | OOD math eval |
| MATH-500 (`HuggingFaceH4/MATH-500`)                              | MIT        | OOD math eval |
| Simon Tatham's Portable Puzzle Collection (puzzle generators)    | MIT        | All puzzle data is procedurally generated from these solvers |

## Puzzle data

The puzzle datasets shipped on the anonymized Hugging Face organization
are derived from the solvers in Simon Tatham's Portable Puzzle Collection.
The solvers themselves are not redistributed here; we ship only the
generated `(initial_state, solution)` pairs in JSON / parquet form. The MIT
license of the upstream solvers permits this redistribution. We credit the
upstream project in the paper's related-work section and in the dataset
cards on Hugging Face.

## Anonymization

Author and institutional copyright lines are removed for the duration of
double-blind review. The camera-ready repository will include the standard
MIT `LICENSE` file with the de-anonymized author list.
