# Mixture of Groups

## 1. Environment Setup

```bash
conda create -n mog python=3.10
conda activate mog
conda install nvidia/label/cuda-12.1.0::cuda-toolkit
conda install pytorch==2.4.0 torchvision==0.19.0 pytorch-cuda=12.1 -c pytorch -c nvidia
pip install -r requirements.txt
```

Install the copy of PEFT in the repository's `peft` directory in editable mode:

```bash
cd peft
pip install -e .
```

## 2. Dataset

We fine-tune on the publicly available commonsense-170k dataset, which can be downloaded [here](https://github.com/AGI-Edgerunners/LLM-Adapters/blob/main/ft-training_set/commonsense_170k.json). For evaluation we use 8 test datasets, which can be downloaded [here](https://github.com/AGI-Edgerunners/LLM-Adapters/tree/main/dataset). Place the files in the project directory as follows:

```bash
# training dataset
commonsense_170k.json
# commonsense test datasets
./dataset
```
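
Before training, it can help to confirm that the files are where the layout above puts them. The following is a minimal sketch and not part of the repository: the helper name `check_layout` is hypothetical, and it assumes only what is shown above, namely a `commonsense_170k.json` file that parses as JSON and a `dataset` directory in the project root.

```python
import json
from pathlib import Path


def check_layout(project_root="."):
    """Return a list of problems with the assumed data layout (empty if none)."""
    root = Path(project_root)
    problems = []

    # Training file: must exist and contain valid JSON.
    train_file = root / "commonsense_170k.json"
    if not train_file.is_file():
        problems.append(f"missing training file: {train_file}")
    else:
        try:
            with train_file.open(encoding="utf-8") as f:
                json.load(f)
        except json.JSONDecodeError as exc:
            problems.append(f"training file is not valid JSON: {exc}")

    # Test datasets: only the presence of the directory is checked.
    test_dir = root / "dataset"
    if not test_dir.is_dir():
        problems.append(f"missing test directory: {test_dir}")

    return problems
```

Calling `check_layout()` from the project root returns an empty list when both paths are present and the training file parses.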

## 3. Quick Start with MoG

```python
import torch
from peft import GroupLoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Load the pretrained base model
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct", torch_dtype=torch.bfloat16, device_map="auto"
)
# Configure fine-tuning with MoG
peft_config = GroupLoraConfig(
    task_type="CAUSAL_LM",
    r=32,
    alpha=32,
    group_size=4,   # Number of layers per group
    num_layers=32,  # Total number of layers in the model
    target_modules=["q_proj", "k_proj", "v_proj"],
)
# Adapt the pretrained model with the fine-tuning module
model = get_peft_model(model, peft_config)
# Load the dataset and run training here
```
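
The `group_size` and `num_layers` settings together determine the number of groups: with the values above, 32 layers in groups of 4 give 8 groups. The sketch below illustrates that arithmetic under the assumption that each group consists of contiguous layers. The helper `layer_to_group` is hypothetical and is not part of `GroupLoraConfig`, whose actual grouping may differ.

```python
def layer_to_group(num_layers, group_size):
    """Map each layer index to a group index, assuming contiguous groups."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return {layer: layer // group_size for layer in range(num_layers)}


mapping = layer_to_group(num_layers=32, group_size=4)
num_groups = len(set(mapping.values()))

print(num_groups)                                       # 8
print(mapping[0], mapping[3], mapping[4], mapping[31])  # 0 0 1 7
```

Under this scheme, if `num_layers` is not a multiple of `group_size`, the last group is smaller than the others.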

## 4. Fine-tuning on Llama3-8b

```bash
# Fine-tune, then evaluate on the test datasets
sh finetune.sh
sh test.sh
```

