# ICLR Unified Pipeline (Modes 1–4)

## Files
- `iclr_runner.py` — main entry point.
- `models.py` — model registry (4 models pre-registered).
- `guard.py` — runtime guard (Mode 3) with keyword/fuzzy/semantic/sequence checks.
- `data_prep.py` — creates/loads standardized datasets for each mode.
- `mode1.py` — dataset sanitization (model-agnostic).
- `mode2_dpo.py` — DPO+LoRA alignment (falls back to simulated metrics when dependencies are missing).
- `mode3.py` — guard evaluation across levels per model.
- `mode4.py` — red-team evaluation (simulated multi-trial metrics).
- `utils.py` — utility helpers, including the Excel writer.
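
To make the layered checks listed for `guard.py` more concrete, here is a minimal sketch of the two cheapest layers (keyword and fuzzy matching) using only the standard library. It is not the code in `guard.py`: the blocklist, the function names, and the `0.8` threshold are assumptions chosen for illustration, and the semantic and sequence checks are omitted.

```python
from difflib import SequenceMatcher

# Hypothetical blocklist; the real guard's terms are not given in this README.
BLOCKED_TERMS = ["steal passwords", "bypass the filter"]


def keyword_hit(prompt: str, terms=BLOCKED_TERMS) -> bool:
    """Exact, case-insensitive substring match against the blocklist."""
    text = prompt.lower()
    return any(term in text for term in terms)


def fuzzy_hit(prompt: str, terms=BLOCKED_TERMS, threshold: float = 0.8) -> bool:
    """Slide a window of each term's length over the prompt and compare it
    to the term with difflib's similarity ratio, to catch light obfuscation."""
    text = prompt.lower()
    for term in terms:
        width = len(term)
        for start in range(max(1, len(text) - width + 1)):
            window = text[start:start + width]
            if SequenceMatcher(None, term, window).ratio() >= threshold:
                return True
    return False


def guard_check(prompt: str) -> dict:
    """Run the cheap checks in order and report which one fired (assumed API)."""
    if keyword_hit(prompt):
        return {"blocked": True, "reason": "keyword"}
    if fuzzy_hit(prompt):
        return {"blocked": True, "reason": "fuzzy"}
    return {"blocked": False, "reason": None}
```

Running the exact match first keeps the common case cheap; the fuzzy pass only runs when no keyword fires.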

## How to run
```bash
python iclr_runner.py --fast          # quick demo run (smaller data)
python iclr_runner.py                 # full-sized demo data
python iclr_runner.py --models llama3-8b qwen2.5-72b  # subset of models
python iclr_runner.py --out ./runs/iclr_2026_exp      # custom output dir
```
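
The flags above can be combined in a single invocation. As a rough illustration of how such a command line might be parsed, the sketch below mirrors the three flags with `argparse`. It is an assumption about the interface, not the actual contents of `iclr_runner.py`; in particular, treating a missing `--models` as "all registered models" is a guess.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in this README."""
    parser = argparse.ArgumentParser(prog="iclr_runner.py")
    parser.add_argument("--fast", action="store_true",
                        help="quick demo run on smaller data")
    parser.add_argument("--models", nargs="+", default=None,
                        help="subset of registered models (assumed: None means all)")
    parser.add_argument("--out", default="./iclr_results",
                        help="output directory for CSV and Excel files")
    return parser


# Example: a fast run restricted to two models, with the default output dir.
args = build_parser().parse_args(["--fast", "--models", "llama3-8b", "qwen2.5-72b"])
print(args.fast, args.models, args.out)
```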

## Outputs
- One CSV file per mode under `./iclr_results/` (or the directory passed via `--out`)
- Excel workbook: `iclr_mode2_3_4_summary.xlsx` with sheets:
  - Mode2_alignment (per-model ASR before/after alignment)
  - Mode3_guard (per-model metrics across guard levels)
  - Mode4_redteam (per-model base vs aligned metrics)
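
For readers interpreting the before/after columns in the Mode2_alignment sheet, the sketch below shows one common way to compute an attack success rate (ASR) and its reduction after alignment. It assumes ASR is the fraction of attack attempts that succeed; the function names and dictionary keys are hypothetical and may not match the pipeline's actual column names.

```python
from typing import Iterable


def attack_success_rate(outcomes: Iterable[bool]) -> float:
    """Fraction of attack attempts that succeeded (True = attack succeeded)."""
    results = list(outcomes)
    if not results:
        raise ValueError("need at least one attack outcome")
    return sum(bool(r) for r in results) / len(results)


def alignment_summary(before: Iterable[bool], after: Iterable[bool]) -> dict:
    """Compare ASR for the base model and the aligned model (assumed layout)."""
    asr_before = attack_success_rate(before)
    asr_after = attack_success_rate(after)
    return {
        "asr_before": asr_before,
        "asr_after": asr_after,
        "absolute_reduction": asr_before - asr_after,
    }


# Toy outcomes for illustration only; these are not results from the pipeline.
summary = alignment_summary(
    before=[True, True, False, True],
    after=[False, True, False, False],
)
print(summary)
```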
