# Linear Mode Connectivity on MNIST

This guide walks through the full pipeline for evaluating **Linear Mode Connectivity (LMC)** in Mixture-of-Experts (MoE) models on the **MNIST** dataset: training a base model, fine-tuning it with an MoE layer, and matching the weights of two fine-tuned checkpoints.

---

## Change Working Directory

To begin, navigate to the project directory:

```bash
cd lmc-moe
```

---

## Step 1: Prepare the Dataset

The dataset is loaded automatically through TensorFlow Datasets (`tensorflow_datasets`) within the code, so no manual preparation is required.
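
For readers who want to inspect the data outside the training scripts, the following is a minimal sketch of loading MNIST with TensorFlow Datasets. It is an assumption about how such loading typically looks, not the repository's actual loader; the helper names `preprocess` and `load_mnist` are hypothetical.

```python
import numpy as np


def preprocess(images, labels):
    """Scale uint8 images to float32 in [0, 1] and cast labels to int32."""
    images = np.asarray(images, dtype=np.float32) / 255.0
    labels = np.asarray(labels, dtype=np.int32)
    return images, labels


def load_mnist(split="train"):
    """Load one MNIST split as NumPy arrays (hypothetical helper)."""
    # Imported lazily so preprocess() can be used without TFDS installed.
    import tensorflow_datasets as tfds

    # batch_size=-1 returns the whole split as a single batch.
    data = tfds.as_numpy(tfds.load("mnist", split=split, batch_size=-1))
    return preprocess(data["image"], data["label"])
```

Calling `load_mnist("train")` downloads the dataset on first use and returns an `(images, labels)` pair.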

---

## Step 2: Train the Base Model

```bash
CUDA_VISIBLE_DEVICES=0 python -m src.mnist.mnist_vit_train \
    --seed 0 \
    --optimizer adam \
    --learning-rate 0.0003 \
    --num-layers 6 \
    --ckpt-path ./pretrain
```

---

## Step 3: Fine-tune with a Mixture-of-Experts (MoE) Layer

```bash
CUDA_VISIBLE_DEVICES=0 python -m src.mnist.mnist_vit_finetune_moe \
    --seed 0 \
    --optimizer sgd \
    --learning-rate 1e-4 \
    --num-layers 2 \
    --num-experts 4 \
    --ckpt-path /ckpts \
    --model-path ./pretrain/mnist_vit_seed0
```

Repeat this step with a different random seed (e.g., `--seed 1`) to produce a second fine-tuned model for weight matching.

---

## Step 4: Apply the Proposed Weight Matching Algorithm to the Two Checkpoints

```bash
CUDA_VISIBLE_DEVICES=0 python -m src.mnist.weight_matching_mnist \
    --num-layers 12 \
    --num-experts 4 \
    --model-a /ckpts/mnist_vit_finetune_moe_seed0 \
    --model-b /ckpts/mnist_vit_finetune_moe_seed1 \
    --plot-path ./plot
```
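
To make the idea behind this step concrete, here is a minimal NumPy sketch. It is not the repository's implementation. It assumes each expert is flattened into a single weight vector, finds the permutation of model B's experts that is closest to model A's by brute force (feasible for 4 experts), and measures the loss barrier along the linear path between two parameter vectors. The names `match_experts`, `interpolate`, and `loss_barrier` are hypothetical, and a real MoE layer would also need its router rows permuted consistently with the experts.

```python
import itertools

import numpy as np


def match_experts(experts_a, experts_b):
    """Return the permutation p minimizing ||experts_a - experts_b[p]||^2.

    Both inputs have shape (num_experts, dim). Brute force checks all
    num_experts! orderings, which is only practical for small counts.
    """
    n = experts_a.shape[0]
    best_perm, best_cost = None, np.inf
    for perm in itertools.permutations(range(n)):
        cost = np.sum((experts_a - experts_b[list(perm)]) ** 2)
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return np.array(best_perm)


def interpolate(params_a, params_b, alpha):
    """Point on the straight line between two parameter vectors."""
    return (1.0 - alpha) * params_a + alpha * params_b


def loss_barrier(loss_fn, params_a, params_b, num_points=11):
    """Largest gap between the loss on the path and the straight line
    joining the two endpoint losses (one common barrier convention)."""
    alphas = np.linspace(0.0, 1.0, num_points)
    losses = np.array(
        [loss_fn(interpolate(params_a, params_b, a)) for a in alphas]
    )
    baseline = (1.0 - alphas) * losses[0] + alphas * losses[-1]
    return float(np.max(losses - baseline))
```

A barrier close to zero after matching suggests the two checkpoints are linearly mode connected. For larger expert counts, an assignment solver would replace the brute-force search.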

---
