# 🚀 SLAKE on LongBench

This repository provides instructions for running **SLAKE** on **LongBench** benchmarks.  
SLAKE is a framework that improves **efficiency** and **accuracy** in long-context LLM inference by combining **KV-cache eviction** and **linear attention**.

---

## 📦 Environment Setup

- **Python**: `3.10`  
- **CUDA**: `12.2`  

### Required Packages
```bash
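# The +cu124 builds are served from the PyTorch wheel index, not PyPI.
pip install transformers==4.43.3 \
            torch==2.6.0+cu124 \
            torchaudio==2.6.0+cu124 \
            torchvision==0.21.0+cu124 \
            tqdm==4.67.1 \
            flash_attn==2.6.2 \
            --extra-index-url https://download.pytorch.org/whl/cu124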
```
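
To confirm that the pinned versions above are the ones actually installed, a quick check along these lines may help. This is a minimal sketch, not part of the repository: the `check_env` function name is hypothetical, and the script only reads package metadata, so it does not verify that CUDA or `flash_attn` work at runtime.

```bash
#!/usr/bin/env bash
# Sketch: compare installed package versions against the pins in this README.
# `check_env` is a hypothetical helper name, not something shipped with SLAKE.

PY="$(command -v python || command -v python3 || true)"

check_env() {
  if [[ -z "$PY" ]]; then
    echo "python not found"
    return 0
  fi
  "$PY" - <<'EOF'
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the "Required Packages" block.
pinned = {
    "transformers": "4.43.3",
    "torch": "2.6.0+cu124",
    "torchaudio": "2.6.0+cu124",
    "torchvision": "0.21.0+cu124",
    "tqdm": "4.67.1",
    "flash_attn": "2.6.2",
}

for name, want in pinned.items():
    try:
        have = version(name)
    except PackageNotFoundError:
        have = "not installed"
    status = "ok" if have == want else "MISMATCH"
    print(f"{name}: expected {want}, found {have} [{status}]")
EOF
}

check_env
```

Any line marked `MISMATCH` points at a package to reinstall before running the benchmark.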

---

## 🧪 Running SLAKE on LongBench

The commands below run `pred.py` with SLAKE for each supported model at cache sizes 128 and 256.

---

### 🔹 LLaMA2-7B
```bash
# cache size 128
python pred.py --model llama2-7b-chat-4k \
               --cache_type SLAKE \
               --cache_size 128 \
               --window_sizes 32 \
               --alpha 0.5 --beta 0.4 \
               --addr 11109

# cache size 256
python pred.py --model llama2-7b-chat-4k \
               --cache_type SLAKE \
               --cache_size 256 \
               --window_sizes 32 \
               --alpha 0.5 --beta 0.4 \
               --addr 11109
```

---

### 🔹 LLaMA3.1-8B
```bash
# cache size 128
python pred.py --model llama-3.1-8B-instruct \
               --cache_type SLAKE \
               --cache_size 128 \
               --window_sizes 32 \
               --alpha 0.4 --beta 0.2 \
               --addr 11109

# cache size 256
python pred.py --model llama-3.1-8B-instruct \
               --cache_type SLAKE \
               --cache_size 256 \
               --window_sizes 32 \
               --alpha 0.6 --beta 0.5 \
               --addr 11109
```

---

### 🔹 Mistral-7B
```bash
# cache size 128
python pred.py --model mistral-0.3-7b-32k \
               --cache_type SLAKE \
               --cache_size 128 \
               --window_sizes 32 \
               --alpha 0.6 --beta 0.4 \
               --addr 11109

# cache size 256
python pred.py --model mistral-0.3-7b-32k \
               --cache_type SLAKE \
               --cache_size 256 \
               --window_sizes 32 \
               --alpha 0.5 --beta 0.2 \
               --addr 11109
```
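
---

### 🔹 Running all configurations

The six runs above differ only in model, cache size, `--alpha`, and `--beta`, so they can be replayed from a small table. The script below is a minimal sketch under that assumption: `run_all`, `CONFIGS`, and the `DRY_RUN` switch are hypothetical names introduced here, not part of the repository. By default it only prints the commands; set `DRY_RUN=0` to execute them.

```bash
#!/usr/bin/env bash
# Sketch: replay the SLAKE settings listed in this README in one loop.
# run_all, CONFIGS and DRY_RUN are illustrative names, not repository features.

# Columns: model  cache_size  alpha  beta  (copied from the commands above)
CONFIGS=(
  "llama2-7b-chat-4k 128 0.5 0.4"
  "llama2-7b-chat-4k 256 0.5 0.4"
  "llama-3.1-8B-instruct 128 0.4 0.2"
  "llama-3.1-8B-instruct 256 0.6 0.5"
  "mistral-0.3-7b-32k 128 0.6 0.4"
  "mistral-0.3-7b-32k 256 0.5 0.2"
)

run_all() {
  local cfg model cache_size alpha beta
  local -a cmd
  for cfg in "${CONFIGS[@]}"; do
    # Split one table row into its four fields.
    read -r model cache_size alpha beta <<< "$cfg"
    cmd=(python pred.py --model "$model"
         --cache_type SLAKE
         --cache_size "$cache_size"
         --window_sizes 32
         --alpha "$alpha" --beta "$beta"
         --addr 11109)
    if [[ "${DRY_RUN:-1}" == "1" ]]; then
      echo "${cmd[*]}"   # dry run: print the command only
    else
      "${cmd[@]}"        # real run: execute pred.py
    fi
  done
}

run_all
```

Saved as, say, `run_all.sh` (a hypothetical file name), `DRY_RUN=0 bash run_all.sh` would launch the runs one after another.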


