# Expert Merging

Official implementation of Expert Merging. This project learns to combine multiple domain-specialized expert LLMs into a single LLM via Unsupervised Expert Alignment and Importance-Guided Layer Chunking.

## Getting Started

### 1. Clone/Download the repository
```bash
# git clone ${repo_url}
cd ExpertMerging
```

### 2. Create and activate an environment
```bash
conda create -n expert-merging python=3.10 -y
conda activate expert-merging
pip install -U pip
pip install -r requirements.txt
```

### 3. Download pretrained experts
Follow the checkpoint download guide from WUDI v2 (Yongxian Wei et al., "Unifying Multimodal Large Language Model Capabilities and Modalities via Model Merging," arXiv 2025). ExpertMerging expects the same base model and task experts in the same folder layout, so you can place the downloaded directories under any cache path (e.g., `/mnt/data/cache`). Then either pass `--cache_dir /mnt/data/cache` on the command line or set `CACHE_DIR="/mnt/data/cache"` in `.env`.
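To illustrate how the two options can interact, here is a minimal sketch of a cache-path lookup. It is not the repository's actual code: the helper names `read_env_file` and `resolve_cache_dir` are hypothetical, and the precedence shown (CLI flag first, then the `CACHE_DIR` environment variable, then `.env`) is an assumption.

```python
import argparse
import os
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY="value" lines from a .env-style file into a dict."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw in env_path.read_text().splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_cache_dir(argv=None, env_file=".env"):
    """Return the cache directory (assumed order: flag, environment, .env)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--cache_dir", default=None)
    # parse_known_args ignores unrelated flags such as --method.
    args, _ = parser.parse_known_args(argv)
    if args.cache_dir:
        return args.cache_dir
    cache_dir = os.environ.get("CACHE_DIR") or read_env_file(env_file).get("CACHE_DIR")
    if not cache_dir:
        raise SystemExit("Set --cache_dir or define CACHE_DIR in .env")
    return cache_dir
```

Under these assumptions, `resolve_cache_dir(["--cache_dir", "/mnt/data/cache"])` returns the flag value without reading `.env`, so a one-off run can override the stored default.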

### 4. Run
```bash
cd expert_merging
python model_merging.py \
  --method expert_merging \
  --cache_dir /mnt/data/cache \
  --data_dir ../dataset
```

### 5. Outputs and Logging
- `results/logs/<method>/<run_name>/merge.log` collects rich-formatted logs (console + file).
- `results/logs/<method>/<run_name>/config.json` records all CLI arguments with a timestamp.
- `results/logs/<method>/<run_name>/model/` stores the merged checkpoint and tokenizer.
- TensorBoard logs for Expert Merging live under the same directory (created by `accelerate`).
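The layout above can be sketched as a small helper that builds the run directory and writes `config.json`. This is only an illustration: `prepare_run_dir` is a hypothetical name, the `results` root is passed in by the caller, and the `"timestamp"` key and its ISO format are assumptions about how the timestamp might be stored.

```python
import json
from datetime import datetime
from pathlib import Path


def prepare_run_dir(root, method, run_name, cli_args):
    """Create <root>/logs/<method>/<run_name>/ and record the CLI arguments."""
    run_dir = Path(root) / "logs" / method / run_name
    model_dir = run_dir / "model"
    # Creating model/ with parents=True also creates the run directory.
    model_dir.mkdir(parents=True, exist_ok=True)

    config = dict(cli_args)
    # Assumed key name and format for the recorded timestamp.
    config["timestamp"] = datetime.now().isoformat(timespec="seconds")
    config_path = run_dir / "config.json"
    config_path.write_text(json.dumps(config, indent=2))

    return {
        "log": run_dir / "merge.log",
        "config": config_path,
        "model": model_dir,
    }
```

For example, `prepare_run_dir("results", "expert_merging", "demo", {"cache_dir": "/mnt/data/cache"})` would return paths under `results/logs/expert_merging/demo/`, where `demo` is a placeholder run name.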