
# Distributional MCTS Ablations & Sensitivity (SyntheticTree)

This repo contains a **minimal, runnable** implementation that reproduces the **Ablations and
Sensitivity Analysis** described in Section *5.3 ABLATIONS AND SENSITIVITY ANALYSIS* of the
accompanying paper. It includes:

- **Component ablations** (distributional Q-nodes vs scalar, TS vs UCB, power-mean backup)
- **Stochasticity sweeps** (deterministic/low/medium/high)
- **Hyperparameter sensitivity** for `p`, `C`, `N` (CATSO atoms), and `K` (PATSO particle cap)
- A **runtime micro-benchmark**

> This code focuses on the SyntheticTree environment, which is fully specified in the paper.
> Its only external dependencies are `numpy`, `pandas`, and `matplotlib`.

## Quick start

```bash
# (inside a Python 3.9+ environment)
pip install numpy pandas matplotlib

# run the full ablations suite (will write CSVs & a few example plots under ./outputs)
python experiments_ablation.py --outdir outputs

# run a small runtime benchmark
python runtime_benchmark.py
```

## Files

- `env_synth_tree.py` — SyntheticTree environment (k-ary depth-d tree) and exact DP for V\*.
- `mcts_core.py` — Distribution holders (CategoricalQ, ParticleQ, ScalarQ), Node/Edge stats, and the MCTS core utilities.
- `algorithms.py` — Action selectors for **CATSO**, **PATSO**, **ScalarTS+Optimism**, **UCT**, and **Power-UCT**.
- `mcts_runner.py` — Orchestrates one MCTS search with the chosen selector and power-mean backup.
- `experiments_ablation.py` — Entry-point to run all ablations and sensitivity sweeps.
- `runtime_benchmark.py` — Micro-benchmark for wall-clock time per decision.

## Notes

- SyntheticTree follows the description in the paper: reward appears only at the leaf, and transitions are randomized with a tunable probability.
- **Optimism term** uses the polynomial form \(C\,T_s^{1/4}/\sqrt{T_{s,a}}\).
- **Power-mean backup** is used at V-nodes with selectable exponent `p∈{1,2,4,8,∞}`.
- **CATSO** uses a dynamic atom grid with Dirichlet counts; grid expands on-demand to include out-of-range samples.
- **PATSO** uses a particle list with a hard cap `K` and **merge-on-insert** to control memory; the merge preserves the first moment.
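
The notes above can be illustrated with a small, dependency-free sketch. It is not the code in `mcts_core.py` or `mcts_runner.py`: the function names (`optimism_bonus`, `power_mean`, `insert_with_cap`) are hypothetical, and several details are assumptions made for illustration, namely that unvisited actions receive an infinite bonus, that the power mean is weighted by visit counts over non-negative values, and that the merge step combines the two closest particles.

```python
import math
from typing import List, Sequence, Tuple

Particle = Tuple[float, float]  # (value, weight)


def optimism_bonus(C: float, T_s: int, T_sa: int) -> float:
    """Polynomial bonus C * T_s^(1/4) / sqrt(T_sa)."""
    if T_sa == 0:
        return math.inf  # assumption: unvisited actions are tried first
    return C * T_s ** 0.25 / math.sqrt(T_sa)


def power_mean(q_values: Sequence[float], counts: Sequence[int], p: float) -> float:
    """Visit-weighted power mean of child values; p = inf reduces to the max.

    Assumes all values are non-negative so that q ** p is well defined.
    """
    visited = [(q, n) for q, n in zip(q_values, counts) if n > 0]
    if not visited:
        raise ValueError("no visited children to back up from")
    if math.isinf(p):
        return max(q for q, _ in visited)
    total = sum(n for _, n in visited)
    return sum((n / total) * q ** p for q, n in visited) ** (1.0 / p)


def insert_with_cap(particles: Sequence[Particle], value: float,
                    weight: float, K: int) -> List[Particle]:
    """Insert a particle; if the list exceeds K (K >= 1), merge the closest pair.

    The merged particle sits at the weighted mean of the pair and carries the
    summed weight, so sum(weight * value) is unchanged (first moment preserved).
    """
    out = sorted(list(particles) + [(value, weight)])
    if len(out) <= K:
        return out
    # In a sorted list the closest pair is always adjacent.
    i = min(range(len(out) - 1), key=lambda j: out[j + 1][0] - out[j][0])
    (v1, w1), (v2, w2) = out[i], out[i + 1]
    w = w1 + w2
    merged = ((w1 * v1 + w2 * v2) / w, w)
    return out[:i] + [merged] + out[i + 2:]


# Usage: larger p moves the backup from the weighted average toward the max.
for p in (1, 2, 4, 8, math.inf):
    print(p, power_mean([0.2, 0.8], [3, 1], p))
```

With `p = 1` the backup is the ordinary visit-weighted average, and as `p` grows it approaches the maximum child value, which is the trade-off the `p` sensitivity sweep explores.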

## Reproducibility

All experiments accept deterministic seeds. CSV outputs are in tidy (long) format and can be post-processed to create the figures in the paper.
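
As a rough illustration of such post-processing, the sketch below aggregates a tidy table across seeds with pandas. The column names (`algorithm`, `seed`, `simple_regret`) and the numbers are hypothetical placeholders, not the actual schema or results written by `experiments_ablation.py`; adapt them to the columns in your CSVs.

```python
import io

import pandas as pd

# Hypothetical tidy (long) data: one row per (algorithm, seed) measurement.
# Replace this with pd.read_csv(<path to a CSV under your --outdir>).
csv_text = """algorithm,seed,simple_regret
CATSO,0,0.10
CATSO,1,0.14
UCT,0,0.20
UCT,1,0.30
"""
df = pd.read_csv(io.StringIO(csv_text))

# Aggregate across seeds: mean, standard deviation, and number of runs.
summary = (
    df.groupby("algorithm")["simple_regret"]
    .agg(["mean", "std", "count"])
    .reset_index()
)
print(summary)
```

Because the data is in long format, adding another grouping key (for example a stochasticity level or a hyperparameter column, if present) only requires extending the `groupby` list.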
