# ProxyAttn: Guided Sparse Attention via Representative Heads

## RULER Evaluation

```bash
# For the Llama model
sh eval/ruler/run_head.sh
# For the Qwen model
sh eval/ruler/run_head_qwen.sh
```

## Implementation

See `ops/auxhead_attention.py` for details.