# Linear Mode Connectivity on ImageNet21k→CIFAR

This guide outlines the complete pipeline for training and fine-tuning Mixture-of-Experts (MoE) models and evaluating their **Linear Mode Connectivity (LMC)** in a transfer learning setting, starting from a checkpoint pretrained on **ImageNet21k** and fine-tuning on the **CIFAR-10** and **CIFAR-100** datasets.

The shell commands below demonstrate the steps for **CIFAR-10**. For **CIFAR-100**, replace the dataset name `cifar10` with `cifar100` in each command.
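One way to avoid editing every command by hand is to derive the module names from a single variable. The sketch below is only an illustration: `DATASET` and `module_for` are hypothetical names, and it assumes the `cifar100` modules follow the same naming pattern as the `cifar10` ones used in this guide.

```bash
#!/usr/bin/env bash
# Assumption: cifar100 modules mirror the cifar10 naming used in this guide.
DATASET="${DATASET:-cifar10}"   # set DATASET=cifar100 to switch datasets

# Hypothetical helper: print the module name for a stage,
# with an optional suffix such as "_moe".
module_for() {
    local stage="$1" suffix="${2:-}"
    echo "${DATASET}.${stage}_${DATASET}${suffix}"
}

# With DATASET=cifar10 these print the module names used in Steps 2 to 4.
module_for finetune
module_for finetune _moe
module_for weight_matching _moe
```

A command from the steps below could then be written as `python -m "$(module_for finetune)"`.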

---

## Prerequisites

### Download Pretrained Checkpoint

Download the pretrained Vision Transformer checkpoint (`ViT-S/16`) from Google’s Vision Transformer repository, or use another pretrained model of your choice. Run the following command to download `ViT-S/16`:

```bash
wget https://storage.googleapis.com/vit_models/imagenet21k/ViT-S_16.npz -P weights/
```
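Before moving on, it can help to confirm that the download actually produced a non-empty file. This is a minimal sketch, not part of the repository: `checkpoint_ok` is a hypothetical helper, and the path assumes the `wget` command above was run from the current directory.

```bash
# Hypothetical helper: succeed only if the file exists and is non-empty.
checkpoint_ok() {
    [ -s "$1" ]
}

CKPT="weights/ViT-S_16.npz"   # location written by the wget command above
if checkpoint_ok "$CKPT"; then
    echo "Found checkpoint: $CKPT ($(wc -c < "$CKPT") bytes)"
else
    echo "Missing or empty checkpoint: $CKPT" >&2
fi
```

The check only tests for presence and size; it does not validate the contents of the `.npz` archive.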

---

## Change Working Directory

To begin, navigate to the project directory:

```bash
cd lmc-moe/transfer_learning
```

---

## Step 1: Prepare the Dataset

No manual download is needed: the code loads the dataset automatically from `tensorflow.datasets`.

---

## Step 2: Train the Base Model

```bash
CUDA_VISIBLE_DEVICES=0 python -m cifar10.finetune_cifar10
```

---

## Step 3: Fine-tune with a Mixture-of-Experts (MoE) Layer

```bash
CUDA_VISIBLE_DEVICES=0 python -m cifar10.finetune_cifar10_moe \
    --seed 0 \
    --learning-rate 5e-2 \
    --num-layers 12 \
    --moe-layer-which 0 \
    --num-experts 4 \
    --ckpt-path /ckpts \
    --model-path ./pretrain/cifar10_vit_seed0
```

Repeat this step with a different random seed (e.g., `--seed 1`) to produce a second fine-tuned model for the weight matching in Step 4.
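The two runs can also be scripted as a loop. The following is a sketch under a few assumptions: `RUN` and `finetune_seeds` are hypothetical names, seeds 0 and 1 are chosen to match the checkpoint names used in Step 4, and only `--seed` changes between runs. By default the loop prints the commands instead of executing them.

```bash
# RUN defaults to "echo", so the commands are only printed (a dry run).
# Export RUN="" before running this block to execute them instead.
RUN="${RUN-echo}"

# Hypothetical helper: fine-tune one MoE model per seed given as an argument.
finetune_seeds() {
    local seed
    for seed in "$@"; do
        $RUN env CUDA_VISIBLE_DEVICES=0 python -m cifar10.finetune_cifar10_moe \
            --seed "$seed" \
            --learning-rate 5e-2 \
            --num-layers 12 \
            --moe-layer-which 0 \
            --num-experts 4 \
            --ckpt-path /ckpts \
            --model-path ./pretrain/cifar10_vit_seed0
    done
}

finetune_seeds 0 1
```

Keeping the dry run as the default makes it easy to inspect the exact commands before committing GPU time.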

---

## Step 4: Run Our Proposed Weight Matching Algorithm on the Two Model Checkpoints

```bash
CUDA_VISIBLE_DEVICES=0 python -m cifar10.weight_matching_cifar10_moe \
    --num-layers 12 \
    --moe-layer-which 0 \
    --num-experts 4 \
    --model-a /ckpts/cifar10_vit_finetune_moe_seed0 \
    --model-b /ckpts/cifar10_vit_finetune_moe_seed1 
```
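For context, LMC between two models is commonly assessed through the loss barrier along the straight line between their weights after matching. The formulation below is the standard one from the LMC literature and is given only as a sketch; the symbols are assumptions introduced here ($\theta_A$ and $\theta_B$ for the weights of `--model-a` and `--model-b`, $\pi$ for the function-preserving permutation found by weight matching, $\mathcal{L}$ for the evaluation loss), and the metric this repository reports may differ.

```latex
\theta(\lambda) = (1-\lambda)\,\theta_A + \lambda\,\pi(\theta_B),
\qquad \lambda \in [0, 1]

B(\theta_A, \theta_B) = \max_{\lambda \in [0,1]}
  \Big[ \mathcal{L}\big(\theta(\lambda)\big)
        - \big( (1-\lambda)\,\mathcal{L}(\theta_A)
                + \lambda\,\mathcal{L}(\pi(\theta_B)) \big) \Big]
```

Under this definition, the two checkpoints are considered linearly mode connected (up to permutation) when the barrier $B$ is close to zero.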

---
