# Benchmark

## SAE Activation

To run:

```bash
export HF_TOKEN=<YOUR_HF_TOKEN>
export NEURONPEDIA_API_KEY=<YOUR_KEY>
export WANDB_API_KEY=<YOUR_KEY>

uv run python src/eliciting_contexts/benchmark/external/sae_activation/run_epo.py \
  --use_assist \
  --output /workspace/eliciting-contexts/data/epo/sae_test_output.json \
  --config /workspace/eliciting-contexts/src/eliciting_contexts/benchmark/external/sae_activation/config.yaml
```

## Simple Stories

This benchmark does...

To run:

```bash
export HF_TOKEN=<YOUR_HF_TOKEN>
export NEURONPEDIA_API_KEY=<YOUR_KEY>
export WANDB_API_KEY=<YOUR_KEY>

uv run python src/eliciting_contexts/benchmark/external/tiny_stories/run_epo.py \
  --use_assist \
  --output /workspace/eliciting-contexts/data/epo/stories_test_output.json \
  --config /workspace/eliciting-contexts/src/eliciting_contexts/benchmark/external/tiny_stories/config.yaml
```

## Backdoors

To run:

```bash
export HF_TOKEN=<YOUR_HF_TOKEN>
export WANDB_API_KEY=<YOUR_KEY>

uv run python src/eliciting_contexts/benchmark/external/backdoors/run_epo.py \
  --output /workspace/eliciting-contexts/data/epo/backdoor_output.json
```
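
Each command above depends on exported keys, so a preflight check can catch a missing one before a long run starts. The following is a minimal sketch, assuming the variables listed in each section are the only ones required. `require_env` is a hypothetical helper and not part of this repository.

```bash
# Hypothetical helper (not part of this repository): succeeds only if every
# named variable is exported and non-empty.
require_env() {
  missing=0
  for name in "$@"; do
    # printenv only sees exported variables, which is what `uv run` inherits.
    if [ -z "$(printenv "$name")" ]; then
      echo "missing or empty environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `require_env HF_TOKEN WANDB_API_KEY && uv run python ...` would start the run only when both keys are present. The helper checks every name before returning, so all missing variables are reported in one pass.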