# Codebase for CompeteSMoE

---

## Overview

This repository provides code and instructions for training and evaluating the CompeteSMoE model on both vision-language and language-only tasks. It builds on two existing frameworks: LibMoE for vision-language training and MoEUT for language model pretraining.

---

## Setup

### Vision-Language Model (VLM)

We use LibMoE for large-scale training. Please follow the LibMoE Quick Start Guide for installation and setup.

For dataset preparation, refer to the Dataset Preparation Section in the LibMoE README.

### Language Model Pretraining

We use the MoEUT framework for training language models.

Install the required dependencies:

```bash
pip install -r ./moe_pretrain_model/requirements.txt
```

For dataset preparation, we use [SlimPajama-627B](https://huggingface.co/datasets/cerebras/SlimPajama-627B), which is downloaded automatically when you run `./moe_pretrain_model/train.sh`. The SlimPajama dataset class is defined in `./moe_pretrain_model/framework/dataset/text/slimpajama.py`.
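
Before launching a long pretraining run, it can help to confirm that the files named above are present in your checkout. The following is a minimal sketch, not part of this repository: `check_paths` is a hypothetical helper that only tests for existence and does not validate file contents.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository): report which of the
# given paths are missing and return non-zero if any of them is absent.
check_paths() {
  local missing=0
  local p
  for p in "$@"; do
    if [ ! -e "$p" ]; then
      echo "missing: $p" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example call, using only paths named in this README:
# check_paths ./moe_pretrain_model/requirements.txt \
#             ./moe_pretrain_model/framework/dataset/text/slimpajama.py \
#             ./moe_pretrain_model/train.sh
```

Because the function returns a non-zero status when anything is missing, it can be chained with `&&` in front of the training command.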

---

## Training

### Train Vision-Language Models

> ⚠️ Note: For the first two training stages (PT and PFT), we follow LibMoE.

#### Train SMoE Baseline

```bash
export TYPE_MOE="smoe"
echo "Starting SFT stage..."
bash ./scripts/train/phi35mini/siglip/sft.sh
```

#### Train CompeteSMoE

```bash
echo "Starting SFT stage for CompeteSMoE..."
bash ./scripts/train/phi35mini/siglip/sft_competesmoe.sh
```

To run the full training pipeline:

```bash
bash ./scripts/train/run_train_all.sh
```
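
If you prefer to run the two SFT variants yourself and keep a separate log for each, a small wrapper along these lines may be useful. This is a sketch under assumptions: `run_stage` and the `./logs` directory are hypothetical names, and nothing here is claimed to match what `run_train_all.sh` does internally.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper (an assumption, not the contents of run_train_all.sh):
# run one training script, write its output to <log_dir>/<name>.log,
# and return the exit status of that script.
run_stage() {
  local name="$1" script="$2" log_dir="${3:-./logs}"
  mkdir -p "$log_dir"
  echo "Starting ${name}..."
  bash "$script" > "${log_dir}/${name}.log" 2>&1
}

# Example: run the baseline, then CompeteSMoE, stopping at the first failure.
# The subshell keeps TYPE_MOE scoped to the baseline run only.
# ( export TYPE_MOE="smoe"
#   run_stage sft_smoe ./scripts/train/phi35mini/siglip/sft.sh ) &&
#   run_stage sft_competesmoe ./scripts/train/phi35mini/siglip/sft_competesmoe.sh
```

Redirecting to a file rather than piping through `tee` keeps the function's return value equal to the training script's exit status without needing `pipefail`.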

---

### Train Language Models

All configuration files are available in:

```
./moe_pretrain_model/sweeps/
```

The `sweeps` directory contains sweep configurations for all experiments we performed.
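
To see which configurations are available, you can list the directory. The sketch below assumes nothing about file names or extensions inside `sweeps`; `list_sweeps` is a hypothetical helper, not a script shipped with this repository.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print every regular file under a sweeps directory,
# sorted, without assuming any particular file extension.
list_sweeps() {
  local dir="${1:-./moe_pretrain_model/sweeps}"
  find "$dir" -type f | sort
}

# Example call from the repository root:
# list_sweeps
```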

Start training with:

```bash
bash ./moe_pretrain_model/train.sh
```

---

## Evaluation

### Evaluate Vision-Language Models

```bash
bash ./scripts/eval/run_eval.sh
```

Refer to the LibMoE evaluation guide for more detailed instructions.

### Evaluate Language Models

```bash
bash ./moe_pretrain_model/eval.sh
```
