# Linear Mode Connectivity on IMDB Reviews

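This guide walks through the full pipeline for measuring **Linear Mode Connectivity (LMC)** in Mixture-of-Experts (MoE) models on the **IMDB 50k Movie Review** dataset: training a base transformer, fine-tuning it with an MoE layer, interpolating between the fine-tuned models, and computing the loss barrier.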

---

## Step 1: Download and Prepare the Dataset

Run the following script to download and extract the dataset from Kaggle:

```bash
bash src/imdbreview/data.sh
```
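
Before training, it can help to confirm that the extracted file looks as expected. The sketch below is not part of the repository; it assumes the CSV keeps the `review` and `sentiment` columns of the Kaggle release, and `summarize_reviews` is a hypothetical helper name.

```python
import csv
from collections import Counter


def summarize_reviews(csv_path):
    """Count rows per sentiment label in the IMDB CSV.

    Assumption: the file has a header row with `review` and `sentiment`
    columns, as in the Kaggle release of the dataset.
    """
    counts = Counter()
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            counts[row["sentiment"]] += 1
    return dict(counts)


# Example call (path taken from the training command in Step 2):
#   summarize_reviews("./data/imdbreview/IMDB_Dataset.csv")
```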

---

## Step 2: Train the Base Transformer Model

Train a baseline transformer model on the IMDB review dataset:

```bash
CUDA_VISIBLE_DEVICES=0 python src/imdbreview/train_model.py \
    --data-path ./data/imdbreview/IMDB_Dataset.csv \
    --save-dir ./weights/imdbreview/pretrained
```

---

## Step 3: Fine-tune with a Mixture-of-Experts (MoE) Layer

Inject a Mixture-of-Experts layer into the pretrained transformer and fine-tune it with a fixed random seed:

```bash
CUDA_VISIBLE_DEVICES=0 python src/imdbreview/finetune_moe.py \
    --model-path weights/imdbreview/pretrained/transformer-layers-1.flax \
    --data-path ./data/imdbreview/IMDB_Dataset.csv \
    --save-dir ./weights/imdbreview/finetune \
    --num-experts 1 \
    --num-shared-experts 0 \
    --num-gated-experts 1 \
    --moe-idx 0 \
    --topk 0 \
    --seed 20
```

Repeat this step with a different random seed for each run (e.g., `--seed 0` and `--seed 20`, the two checkpoints used in Step 4) to generate the models for interpolation.
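
The repeated runs can also be driven from a small script. The following is a sketch only: it reuses the flags from the command above, while `finetune_command`, `run_seeds`, and the `dry_run` switch are hypothetical names introduced for illustration. By default it prints the commands instead of launching them.

```python
import os
import subprocess

# Flags copied from the fine-tuning command above; only the seed varies.
BASE_ARGS = [
    "python", "src/imdbreview/finetune_moe.py",
    "--model-path", "weights/imdbreview/pretrained/transformer-layers-1.flax",
    "--data-path", "./data/imdbreview/IMDB_Dataset.csv",
    "--save-dir", "./weights/imdbreview/finetune",
    "--num-experts", "1",
    "--num-shared-experts", "0",
    "--num-gated-experts", "1",
    "--moe-idx", "0",
    "--topk", "0",
]


def finetune_command(seed):
    """Return the argument list for one fine-tuning run with the given seed."""
    return BASE_ARGS + ["--seed", str(seed)]


def run_seeds(seeds, gpu="0", dry_run=True):
    """Print (dry_run=True) or execute one fine-tuning run per seed."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu)
    commands = [finetune_command(seed) for seed in seeds]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the sweep if a run exits with an error.
            subprocess.run(cmd, env=env, check=True)
    return commands


# Example: run_seeds([0, 20]) prints the two commands needed for Step 4.
```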

---

## Step 4: Naive Linear Interpolation

Perform naive interpolation between two fine-tuned MoE models:

```bash
CUDA_VISIBLE_DEVICES=0 python src/imdbreview/naive_interpolate.py \
    --model-a weights/imdbreview/finetune/idx-0-shared-0-gated-1-topk-0-seed-0.flax \
    --model-b weights/imdbreview/finetune/idx-0-shared-0-gated-1-topk-0-seed-20.flax \
    --data-path ./data/imdbreview/IMDB_Dataset.csv
```
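
Conceptually, linear interpolation forms the weights `(1 - alpha) * A + alpha * B` for a range of `alpha` values between 0 and 1 and evaluates the model at each point; "naive" usually means the two models are combined as-is, with no re-ordering of experts or neurons beforehand. The sketch below illustrates the idea on plain nested dictionaries and is not taken from `naive_interpolate.py`; `interpolate` and the toy parameters are hypothetical. For Flax parameter trees, `jax.tree_util.tree_map` can apply the same leaf-wise formula.

```python
def interpolate(params_a, params_b, alpha):
    """Return (1 - alpha) * params_a + alpha * params_b, leaf by leaf."""
    if isinstance(params_a, dict):
        if params_a.keys() != params_b.keys():
            raise ValueError("parameter trees have different keys")
        return {k: interpolate(params_a[k], params_b[k], alpha) for k in params_a}
    if isinstance(params_a, (list, tuple)):
        if len(params_a) != len(params_b):
            raise ValueError("parameter shapes differ")
        return [interpolate(a, b, alpha) for a, b in zip(params_a, params_b)]
    # Leaf: a single number.
    return (1.0 - alpha) * params_a + alpha * params_b


# Toy parameter trees standing in for two fine-tuned checkpoints.
params_a = {"dense": {"kernel": [[1.0, 0.0], [0.0, 1.0]], "bias": [0.0, 0.0]}}
params_b = {"dense": {"kernel": [[3.0, 2.0], [2.0, 3.0]], "bias": [1.0, 1.0]}}

# Eleven evenly spaced points from model A (alpha=0) to model B (alpha=1).
alphas = [i / 10 for i in range(11)]
path = [interpolate(params_a, params_b, alpha) for alpha in alphas]
midpoint = interpolate(params_a, params_b, 0.5)
```

In a real run, each point on `path` would be evaluated on held-out data to obtain a loss or accuracy value per `alpha`.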

---

## Step 5: Calculate Loss Barrier

Compute the loss barrier from the interpolation results file:

```bash
python src/imdbreview/loss_barrier.py \
    --file-path 'results/imdbreview/[idx-0-shared-0-gated-1-topk-0-seed-0.flax+idx-0-shared-0-gated-1-topk-0-seed-20.flax].json'
```

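Replace the example path with the actual path to the generated result file. Keep the path quoted if it contains square brackets, which most shells treat as glob characters.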
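
For reference, a common definition of the loss barrier is the largest gap between the loss along the interpolation path and the straight line connecting the two endpoint losses. The sketch below follows that definition and may differ from what `loss_barrier.py` computes; `loss_barrier`, its arguments, and the toy numbers are hypothetical and do not reflect the layout of the JSON file.

```python
def loss_barrier(alphas, losses):
    """Largest excess of the path loss over the linear endpoint baseline.

    Assumption: `alphas` runs from 0.0 (model A) to 1.0 (model B) and
    `losses[i]` is the loss of the model interpolated at `alphas[i]`.
    """
    if len(alphas) != len(losses) or len(alphas) < 2:
        raise ValueError("need matching alphas and losses with at least two points")
    if alphas[0] != 0.0 or alphas[-1] != 1.0:
        raise ValueError("alphas must start at 0.0 and end at 1.0")
    loss_a, loss_b = losses[0], losses[-1]
    gaps = [
        loss - ((1.0 - alpha) * loss_a + alpha * loss_b)
        for alpha, loss in zip(alphas, losses)
    ]
    return max(gaps)


# Toy values for illustration only, not measured results.
toy_alphas = [0.0, 0.25, 0.5, 0.75, 1.0]
toy_losses = [0.30, 0.45, 0.60, 0.50, 0.40]
toy_barrier = loss_barrier(toy_alphas, toy_losses)
```

A barrier close to zero indicates that the two models are linearly connected under this definition, while a large positive value indicates a loss ridge between them.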

---
