# On Linear Mode Connectivity of Mixture-of-Experts Architectures


[![Documentation](https://img.shields.io/badge/docs-passing-brightgreen)](https://github.com/repo/docs)
[![Paper](https://img.shields.io/badge/arXiv-XXXX.XXXXX-blue)](https://arxiv.org/abs/XXXX.XXXXX)

This repository accompanies the paper:
***“On Linear Mode Connectivity of Mixture-of-Experts Architectures”*** (NeurIPS 2025 submission)

<p align="center"><strong>ImageNet: Linear Mode Connectivity</strong></p>
<p align="center">
  <img src="plots/imagenet/imagenet_lmc.png" width="500px"/>
</p>


## Installation

```bash
git clone https://github.com/repo/lmc-moe.git
cd lmc-moe
pip install -e .
pip install -r requirements.txt
```

## Repository Structure

```text
src/
├── agnews/               # Appendix experiment: Reinit FFN
├── cifar10/              # Main experiment
├── cifar100/             # Main experiment
├── dbpedia/              # Appendix experiment: Reinit FFN
├── enwik8/               # Appendix experiment: Reinit FFN
├── imagenet/             # Main experiment
├── imdbreview/           # Appendix experiment: Reinit FFN
├── lm1b/                 # Main experiment
├── mnist/                # Main experiment
├── penn/                 # Appendix experiment: Reinit FFN 
├── transfer_learning/    # Main experiment
├── wikitext103/          # Main experiment
├── datasets.py
├── utils.py
├── weight_matching.py
└── online_stats.py
```

Each dataset directory includes a standalone `README.md` with detailed steps for data preparation, training, and evaluation.


## Linear Mode Connectivity Results

### ImageNet, WikiText103, One Billion Word (lm1b)

<p align="center"><strong>WikiText103: Linear Mode Connectivity</strong></p>
<p align="center">
  <img src="plots/wikitext103/wikitext_lmc.png" width="500px"/>
</p>

<p align="center"><strong>One Billion Word (LM1B): Linear Mode Connectivity</strong></p>
<p align="center">
  <img src="plots/lm1b/lm1b_lmc.png" width="500px"/>
</p>
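
For readers new to this kind of plot, the sketch below illustrates how a linear mode connectivity curve is commonly summarized: two parameter sets are linearly interpolated, the loss is evaluated along the path, and the largest gap above the straight line between the endpoint losses is reported as a barrier. This is not the repository's code. The names `Params`, `interpolate`, `loss_barrier`, and `double_well_loss` are hypothetical, the toy loss stands in for a real evaluation loop, and the barrier definition is one common choice that may differ from the metric used in the paper.

```python
from typing import Callable, Dict, List

# Hypothetical container: parameter name -> flat list of values.
Params = Dict[str, List[float]]


def interpolate(theta_a: Params, theta_b: Params, alpha: float) -> Params:
    """Return (1 - alpha) * theta_a + alpha * theta_b, parameter by parameter."""
    return {
        name: [(1.0 - alpha) * a + alpha * b
               for a, b in zip(theta_a[name], theta_b[name])]
        for name in theta_a
    }


def loss_barrier(
    theta_a: Params,
    theta_b: Params,
    loss_fn: Callable[[Params], float],
    num_points: int = 11,
) -> float:
    """Largest excess of the path loss over the linear baseline of endpoint losses."""
    loss_a = loss_fn(theta_a)
    loss_b = loss_fn(theta_b)
    barrier = 0.0
    for i in range(num_points):
        alpha = i / (num_points - 1)
        path_loss = loss_fn(interpolate(theta_a, theta_b, alpha))
        baseline = (1.0 - alpha) * loss_a + alpha * loss_b
        barrier = max(barrier, path_loss - baseline)
    return barrier


def double_well_loss(theta: Params) -> float:
    """Toy loss with minima at w = -1 and w = +1 and a bump at w = 0."""
    w = theta["w"][0]
    return (w * w - 1.0) ** 2


if __name__ == "__main__":
    theta_a = {"w": [-1.0]}
    theta_b = {"w": [1.0]}
    print(loss_barrier(theta_a, theta_b, double_well_loss))
```

A barrier close to zero indicates that the two solutions are linearly connected under this definition. In practice `loss_fn` would load the interpolated weights into a model and evaluate it on held-out data, typically after aligning the two models (see `src/weight_matching.py`).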


## Getting Started

Each dataset experiment can be run individually. See the corresponding `src/<dataset>/README.md` for configuration options.


## Citation

If you find this work helpful, please consider citing:

```bibtex
@article{our2025moelmc,
  title={On Linear Mode Connectivity of Mixture-of-Experts Architectures},
  author={Coauthors},
  journal={arXiv:XXXX.XXXXX},
  year={2025}
}
```


## Acknowledgements

We thank the contributors and maintainers of open-source libraries, including PyTorch, JAX, Flax, and Hugging Face Transformers. Special thanks to the authors of recent works on LMC and MoE architectures for foundational insights.


## Contributing

We welcome pull requests and suggestions. Please ensure that new features and bug fixes include tests where appropriate and follow the existing code style.


## License

This project is licensed under the MIT License.

