# Planted-hierarchy controlled experiment

## Purpose

This experiment evaluates NTS distance against a ground-truth structural distance that is known by construction and computed independently of the metric under test. It is a controlled sanity check, not a benchmark.

## Setup

- Data are generated from a known planted hierarchy.
- A controlled fraction of samples is reassigned to different hierarchy leaves.
- Nuisance regimes: `baseline`, `small_n`, `large_n`, `high_noise`, `high_dim`, `noise_dims`, `imbalanced`.
- 7 nuisance regimes × 10 repeats × 7 perturbation fractions = 490 representation pairs in total.
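
The perturbation step can be illustrated with a minimal sketch. It assumes leaves are integer labels `0..n_leaves-1` and that each selected sample moves to a uniformly chosen *different* leaf. The function name `perturb_leaves` and its parameters are hypothetical, and the actual sampling scheme in `run_planted_hierarchy_experiment.py` may differ.

```python
import random

def perturb_leaves(leaf_labels, fraction, n_leaves, seed=0):
    """Reassign a fraction of samples to a different leaf.

    Assumption (not from the source): the new leaf is drawn uniformly
    from all leaves other than the sample's current one.
    """
    rng = random.Random(seed)
    n = len(leaf_labels)
    n_move = round(fraction * n)
    moved = rng.sample(range(n), n_move)  # indices of samples to reassign
    perturbed = list(leaf_labels)
    for i in moved:
        choices = [leaf for leaf in range(n_leaves) if leaf != perturbed[i]]
        perturbed[i] = rng.choice(choices)
    return perturbed

# Toy example: 100 samples spread evenly over 4 leaves, 20% reassigned.
labels = [i % 4 for i in range(100)]
perturbed = perturb_leaves(labels, fraction=0.2, n_leaves=4)
changed = sum(a != b for a, b in zip(labels, perturbed))
```

Because every moved sample lands on a different leaf, `changed` equals the number of reassigned samples exactly, which keeps the perturbation fraction a faithful control variable.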

Ground-truth distance, defined as one minus the mean adjusted Rand index (ARI) over the `L` hierarchy levels:

```text
d_GT = 1 - (1/L) * sum_{ell=1}^L ARI(P_ell, P_tilde_ell)
```
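
As a sketch of how `d_GT` might be computed, the block below implements the standard pair-counting ARI in pure Python and averages it over levels. It assumes each level is given as a flat list of cluster labels, one per sample, with at least two samples. The names `adjusted_rand_index` and `ground_truth_distance` are illustrative and not taken from the reproduction script.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two flat partitions (needs n >= 2)."""
    n = len(labels_a)
    contingency = Counter(zip(labels_a, labels_b))
    sum_ij = sum(comb(c, 2) for c in contingency.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        # Degenerate case (e.g. both partitions trivial): treat as agreement.
        return 1.0
    return (sum_ij - expected) / (max_index - expected)

def ground_truth_distance(levels_ref, levels_pert):
    """d_GT = 1 - mean ARI over paired hierarchy levels."""
    aris = [adjusted_rand_index(p, q) for p, q in zip(levels_ref, levels_pert)]
    return 1.0 - sum(aris) / len(aris)

# Toy two-level hierarchy over 8 samples (coarse level, then fine level).
ref = [[0, 0, 0, 0, 1, 1, 1, 1], [0, 0, 1, 1, 2, 2, 3, 3]]
# One sample moved across the coarse split, which also changes its fine leaf.
pert = [[0, 0, 0, 1, 1, 1, 1, 1], [0, 0, 1, 2, 2, 2, 3, 3]]
d_same = ground_truth_distance(ref, ref)
d_pert = ground_truth_distance(ref, pert)
```

ARI is invariant to relabeling of clusters, so `d_GT` depends only on how samples are grouped at each level, not on the leaf identifiers themselves.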

NTS distance:

```text
D_NTS = (1 - NTS) / 2
```
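
The mapping is a simple affine rescaling. The sketch below assumes NTS is a similarity score in `[-1, 1]` (this README does not state the range), in which case `D_NTS` lies in `[0, 1]`. The function name `nts_distance` is hypothetical.

```python
def nts_distance(nts):
    """Map an NTS similarity to a distance via D_NTS = (1 - NTS) / 2.

    Assumption: NTS is in [-1, 1], so the result is in [0, 1].
    """
    return (1.0 - nts) / 2.0

d_identical = nts_distance(1.0)    # perfect similarity
d_neutral = nts_distance(0.0)      # no similarity
d_opposite = nts_distance(-1.0)    # perfect dissimilarity
```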

## Summary

| Metric | Spearman | Kendall | Pearson | Ranking accuracy |
|---|---:|---:|---:|---:|
| D_NTS_E | 0.958 | 0.823 | 0.971 | 0.912 |
| D_NTS_M | 0.956 | 0.820 | 0.968 | 0.910 |
| RTD | 0.900 | 0.732 | 0.906 | 0.866 |
| SRTD | 0.908 | 0.743 | 0.914 | 0.872 |
| RTD-lite | 0.787 | 0.606 | 0.803 | 0.803 |
| SRTD-lite | 0.891 | 0.721 | 0.899 | 0.861 |
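
This README does not define "ranking accuracy". One plausible reading, sketched below as an assumption, is the fraction of pairs of representation pairs whose ordering under the metric agrees with their ordering under `d_GT`. Under that definition and without ties it equals `(1 + Kendall) / 2`, which is consistent with the table (for example, `(1 + 0.823) / 2 ≈ 0.912` for `D_NTS_E`). The function name `pairwise_ranking_accuracy` is illustrative.

```python
from itertools import combinations

def pairwise_ranking_accuracy(d_gt, d_metric):
    """Fraction of comparable pairs ordered the same way by both distances.

    Assumed definition, not confirmed by the source. Pairs tied in the
    ground truth are skipped; metric ties count as incorrect.
    """
    concordant = 0
    comparable = 0
    for (g1, m1), (g2, m2) in combinations(zip(d_gt, d_metric), 2):
        if g1 == g2:
            continue  # ground truth gives no ordering for this pair
        comparable += 1
        if (g1 - g2) * (m1 - m2) > 0:
            concordant += 1
    return concordant / comparable

# Toy data: the metric swaps the order of the first two items only.
d_gt = [0.0, 0.1, 0.3, 0.6]
d_metric = [0.05, 0.02, 0.2, 0.5]
acc = pairwise_ranking_accuracy(d_gt, d_metric)
```

In the toy data one of the six pairs is ordered incorrectly, so `acc` is 5/6.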

## Files

- `planted_hierarchy_raw_results.csv`: raw results for all representation pairs.
- `planted_hierarchy_summary.csv`: summary correlations and ranking accuracy.
- `run_planted_hierarchy_experiment.py`: sanitized reproduction script.
