# Rank-efficient Mixture of Experts for LLM Finetuning

The codebase for this work is based upon the [MoE-PEFT](https://github.com/TUDB-Labs/MoE-PEFT) repository.

## Installation

First, clone the MoE-PEFT repository, check out the commit below, and follow its instructions to set up the initial environment.
```
git clone https://github.com/TUDB-Labs/MoE-PEFT.git
cd MoE-PEFT
git checkout 40f8cb742cb20eda861099563c96fbb5e6ff3bc2
```

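Next, copy the contents of the `RE-MoE` folder into `MoE-PEFT` and apply `mod.patch` to enable the new `SharedLoRA` and `OperA` adapters. The paths below are relative to the directory that contains both `RE-MoE-ICLR` and `MoE-PEFT`.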

```
cp -r RE-MoE-ICLR/RE-MoE/* MoE-PEFT/
git apply MoE-PEFT/mod.patch
```

## Experiments

We provide the configurations for `SharedLoRA` and `OperA` for the `Llama3-8B` model in the folder `configs`. First, copy them into the `MoE-PEFT` folder:
```
cp -r RE-MoE-ICLR/configs MoE-PEFT/
```

Then run them with
```
python MoE-PEFT/moe_peft.py --base_model "meta-llama/Meta-Llama-3-8B" --config "MoE-PEFT/configs/<config_file.json>" --seed 42 --log_file llama3_<adapter>.log --bf16
```
to reproduce the results reported in the paper.
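
To run every provided configuration in turn, a loop along the following lines may help. This is a minimal sketch and not part of the repository: `build_cmd` and the `DRY_RUN` variable are hypothetical helpers, and reusing the config file's base name in the log file name is an assumption (the command above only fixes the pattern `llama3_<adapter>.log`). By default it only prints the commands.

```
#!/usr/bin/env bash
# Sketch: print (or run) the documented command for each config in MoE-PEFT/configs.
# Assumption: the log file name reuses the config file's base name.
set -euo pipefail

# Hypothetical helper: build the documented command line for one config file.
build_cmd() {
  local config="$1"
  local name
  name="$(basename "$config" .json)"
  printf 'python MoE-PEFT/moe_peft.py --base_model "meta-llama/Meta-Llama-3-8B" --config "%s" --seed 42 --log_file llama3_%s.log --bf16\n' "$config" "$name"
}

# With nullglob, the loop is skipped entirely when no config files are found.
shopt -s nullglob
for config in MoE-PEFT/configs/*.json; do
  cmd="$(build_cmd "$config")"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$cmd"      # dry run (default): only show the command
  else
    eval "$cmd"      # DRY_RUN=0: actually launch the run
  fi
done
```

Run it from the directory that contains `MoE-PEFT`, and set `DRY_RUN=0` once the printed commands look right.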