# Penn Treebank (Word Level) - MoE Linear Mode Connectivity Pipeline

This document outlines the full experimental pipeline on the word-level Penn Treebank dataset: training a base model, fine-tuning it with Mixture-of-Experts (MoE) layers, matching weights between fine-tuned models, and evaluating linear mode connectivity.

---

## Step 1: Preprocess the Dataset

```bash
python src/penn/preprocess.py \
    --tokenizer-model gpt2 \
    --dataset-split-type train \
    --output-path ./data/penn/penn.train
```
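
Later steps also read `penn.valid` and `penn.test` files, so the remaining splits need the same preprocessing. The loop below is a minimal sketch under two unconfirmed assumptions: that `--dataset-split-type` accepts `validation` and `test`, and that the outputs should be named `penn.valid` and `penn.test` (check `preprocess.py` for the actual values). The `-00000-of-00001` suffix seen in later paths is presumably appended by the script. The `run` helper is hypothetical; it only prints each command unless `DRY_RUN=0` is set.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the command by default, execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Assumed "split type:output suffix" pairs; verify against preprocess.py.
preprocess_all() {
    local pair split suffix
    for pair in train:train validation:valid test:test; do
        split="${pair%%:*}"
        suffix="${pair##*:}"
        run python src/penn/preprocess.py \
            --tokenizer-model gpt2 \
            --dataset-split-type "$split" \
            --output-path "./data/penn/penn.$suffix"
    done
}

preprocess_all
```

Running it as-is prints the three commands for review; rerun with `DRY_RUN=0` once the split names are confirmed.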

---

## Step 2: Train the Base Model

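Train the base model (GPT-2 configuration) with the following command. `WANDB_MODE=offline` keeps Weights & Biases logging local instead of syncing it, and `CUDA_VISIBLE_DEVICES=0` restricts the run to the first GPU:
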
```bash
WANDB_MODE=offline CUDA_VISIBLE_DEVICES=0 python src/penn/train_model.py \
    --model-config-name gpt2 \
    --train-dataset-paths ./data/penn/penn.train-00000-of-00001 \
    --eval-dataset-paths ./data/penn/penn.valid-00000-of-00001 \
    --model-save-dir /root/weights/penn/
```

---

## Step 3: Fine-tune with MoE

```bash
CUDA_VISIBLE_DEVICES=0 python src/penn/finetune_moe.py \
    --model-path /root/weights/penn/Main-lr0.0002-epoch10-batch64/checkpoint_1329 \
    --train-dataset-paths ./data/penn/penn.train-00000-of-00001 \
    --eval-dataset-paths ./data/penn/penn.valid-00000-of-00001 \
    --moe-layer-indices 0 \
    --num-shared-experts 0 \
    --num-routed-experts 2 \
    --topk 2 \
    --seed 0 \
    --model-save-dir /root/weights/penn/finetune
```
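
Step 4 reads checkpoints from both a `seed0` and a `seed20` run, so the fine-tuning command has to be run once per seed. A minimal sketch of that loop, reusing the exact arguments above and changing only `--seed`. The `run` helper is hypothetical; it prints each command by default and executes it only when `DRY_RUN=0` is set.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the command by default, execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Fine-tune once per seed; every other argument matches the command above.
finetune_seeds() {
    local seed
    for seed in "$@"; do
        run env CUDA_VISIBLE_DEVICES=0 python src/penn/finetune_moe.py \
            --model-path /root/weights/penn/Main-lr0.0002-epoch10-batch64/checkpoint_1329 \
            --train-dataset-paths ./data/penn/penn.train-00000-of-00001 \
            --eval-dataset-paths ./data/penn/penn.valid-00000-of-00001 \
            --moe-layer-indices 0 \
            --num-shared-experts 0 \
            --num-routed-experts 2 \
            --topk 2 \
            --seed "$seed" \
            --model-save-dir /root/weights/penn/finetune
    done
}

# Seeds 0 and 20 are the two runs referenced in Step 4.
finetune_seeds 0 20
```

Passing more seeds to `finetune_seeds` gives additional model pairs to compare.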

---

## Step 4: Perform Weight Matching Between Fine-Tuned MoE Models

This step takes two fine-tuned checkpoints whose run names differ only in seed (seeds 0 and 20 below), so both Step 3 runs must have finished first.

```bash
CUDA_VISIBLE_DEVICES=0 python src/penn/weight_matching_moe.py \
    --model-a /root/weights/penn/finetune/lr0.0002-topk2-shared0-routed2-seed0/checkpoint_1329 \
    --model-b /root/weights/penn/finetune/lr0.0002-topk2-shared0-routed2-seed20/checkpoint_1329 \
    --train-dataset-paths ./data/penn/penn.train-00000-of-00001 \
    --eval-dataset-paths ./data/penn/penn.test-00000-of-00001
```

---

## Step 5: Evaluate Loss Barrier for Interpolated Models

```bash
# Quote the path: unquoted [ ] is treated as a glob pattern by the shell.
python src/penn/loss_barrier.py \
    --file-path '/root/results/penn/[lr0.0002-topk2-shared0-routed2-seed0+lr0.0002-topk2-shared0-routed2-seed20].json'
```
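
When sweeping several seed pairs, the run names and results path can be built from the hyperparameters. The sketch below assumes the naming pattern inferred from the paths in this document (`lr…-topk…-shared…-routed…-seed…` and `[A+B].json`); the scripts themselves may name files differently. `run_name` and `result_path` are hypothetical helpers, not part of the repository.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: rebuild a run name such as
# lr0.0002-topk2-shared0-routed2-seed0 from its hyperparameters.
run_name() {
    local lr="$1" topk="$2" shared="$3" routed="$4" seed="$5"
    printf 'lr%s-topk%s-shared%s-routed%s-seed%s' \
        "$lr" "$topk" "$shared" "$routed" "$seed"
}

# Hypothetical helper: results path for a pair of runs, in the "[A+B].json" form.
result_path() {
    printf '/root/results/penn/[%s+%s].json' "$1" "$2"
}

a="$(run_name 0.0002 2 0 2 0)"
b="$(run_name 0.0002 2 0 2 20)"
result_file="$(result_path "$a" "$b")"

# Print the Step 5 command with the path single-quoted so [ ] is not globbed.
printf "python src/penn/loss_barrier.py --file-path '%s'\n" "$result_file"
```

Keeping the path in a quoted variable avoids retyping the long bracketed file name for each pair.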

