# Code submission

## Evaluation figure generation
- `eval_token_acc_tradeoff.ipynb`
  - For cost/accuracy tradeoff evaluation
- `eval_latency_analysis.ipynb`
  - Latencies and TTFTs (time to first token) at the default QPM
  - Latencies as a function of QPM
  - FCFS vs. complexity-aware
- `eval_correctness_pred_acc.ipynb`
  - Per-layer accuracy bar plot

## Key scripts

### Training data collection

`datagen_*.py`

### Per-branch correctness predictor training

`train_*_dsk-r1-llama.py`

### Request difficulty prediction for MATH

`train_request_difficulty_prediction_math.py`

### Cost-accuracy tradeoff evaluations

`token_accuracy_measurement.py`

### Latency evaluations

The commands in `latency_exp_settings.md` call the following scripts:

- `latency_traces.py`
- `latency_measurement_vllm.py`
- `latency_measurement_torch.py`
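
For orientation, the latency experiments are parameterized by QPM. The sketch below shows one way request arrival times at a target load could be drawn. It assumes that QPM means queries per minute and that arrivals follow a Poisson process. `arrival_times` is a hypothetical helper for illustration only, not a function taken from `latency_traces.py`.

```python
import random


def arrival_times(qpm: float, num_requests: int, seed: int = 0) -> list[float]:
    """Return cumulative arrival timestamps in seconds.

    Assumption: arrivals form a Poisson process, so the gaps between
    consecutive requests are exponentially distributed.
    """
    if qpm <= 0:
        raise ValueError("qpm must be positive")
    rng = random.Random(seed)      # seeded for reproducible traces
    rate_per_sec = qpm / 60.0      # convert queries/minute to queries/second
    t = 0.0
    times = []
    for _ in range(num_requests):
        t += rng.expovariate(rate_per_sec)  # exponential inter-arrival gap
        times.append(t)
    return times
```

Calling `arrival_times(qpm=30, num_requests=100)` would give 100 timestamps with an expected gap of 2 seconds between requests.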

