
We are releasing all of our experiment checkpoints publicly to support the community's research on Mixture of Experts (MoE). We hope that reusing the checkpoints from the **Pre-Training** and **Pre-FineTuning** stages will save others time and computational resources in their own experiments.

|       Stage       |      Method       | SigLIP 224 + Phi3.5                                                                                          | SigLIP 224 + Phi3                                                                                     | CLIP 336 + Phi3                                                                                     |
|:-----------------:|:-----------------:|:-----------------------------------------------------------------------------------------------------------:|:-----------------------------------------------------------------------------------------------------:|:---------------------------------------------------------------------------------------------------:|
| **Pre-Training**  |                   | [Link](https://huggingface.co/Fsoft-AIC/Phi3.5-Siglip-MoE/tree/main/pretrain)                               | [Link](https://huggingface.co/Fsoft-AIC/Phi3-SigLiP-MoE/tree/main/pretrain)                           | [Link](https://huggingface.co/Fsoft-AIC/Phi3-CLIP-MoE/tree/main/pretrain)                                |
| **Pre-FineTuning**|                   | [Link](https://huggingface.co/Fsoft-AIC/Phi3.5-Siglip-MoE/tree/main/pft)                                    | [Link](https://huggingface.co/Fsoft-AIC/Phi3-SigLiP-MoE/tree/main/pft)                               | [Link](https://huggingface.co/Fsoft-AIC/Phi3-CLIP-MoE/tree/main/pft)                           |
| **VIT 665K**      | SMoE-R            | [Link](https://huggingface.co/Fsoft-AIC/Phi3.5-Siglip-MoE/tree/main/sft_full/smoe)                          | [Link](https://huggingface.co/Fsoft-AIC/Phi3-SigLiP-MoE/tree/main/sft_full/smoe)                     | [Link](https://huggingface.co/Fsoft-AIC/Phi3-CLIP-MoE/tree/main/sft_full/smoe)                      |
|                   | Cosine-R          | [Link](https://huggingface.co/Fsoft-AIC/Phi3.5-Siglip-MoE/tree/main/sft_full/smoe_cosinegating)             | [Link](https://huggingface.co/Fsoft-AIC/Phi3-SigLiP-MoE/tree/main/sft_full/smoe_cosinegating)        | [Link](https://huggingface.co/Fsoft-AIC/Phi3-CLIP-MoE/tree/main/sft_full/smoe_cosinegating)         |
|                   | Sigmoid-R         | [Link](https://huggingface.co/Fsoft-AIC/Phi3.5-Siglip-MoE/tree/main/sft_full)                               | [Link](https://huggingface.co/Fsoft-AIC/Phi3-SigLiP-MoE/tree/main/sft_full)                          | [Link](https://huggingface.co/Fsoft-AIC/Phi3-CLIP-MoE/tree/main/sft_full)                           |
|                   | Hyper-R           | [Link](https://huggingface.co/Fsoft-AIC/Phi3.5-Siglip-MoE/tree/main/sft_full/hyperrouter)                   | [Link](https://huggingface.co/Fsoft-AIC/Phi3-SigLiP-MoE/tree/main/sft_full/hyperrouter)              | [Link](https://huggingface.co/Fsoft-AIC/Phi3-CLIP-MoE/tree/main/sft_full/hyperrouter)               |
|                   | Perturbed Cosine-R| [Link](https://huggingface.co/Fsoft-AIC/Phi3.5-Siglip-MoE/tree/main/sft_full/smoe_perturbed)                | [Link](https://huggingface.co/Fsoft-AIC/Phi3-SigLiP-MoE/tree/main/sft_full/smoe_perturbed)           | [Link](https://huggingface.co/Fsoft-AIC/Phi3-CLIP-MoE/tree/main/sft_full/smoe_perturbed)            |
| **VIT 332K**      | SMoE-R            | [Link](https://huggingface.co/Fsoft-AIC/Phi3.5-Siglip-MoE/tree/main/sft/smoe)                               | [Link](https://huggingface.co/Fsoft-AIC/Phi3-SigLiP-MoE/tree/main/sft/smoe)                          | [Link](https://huggingface.co/Fsoft-AIC/Phi3-CLIP-MoE/tree/main/sft/smoe)                           |
|                   | Cosine-R          | [Link](https://huggingface.co/Fsoft-AIC/Phi3.5-Siglip-MoE/tree/main/sft/smoe_cosinegating)                  | [Link](https://huggingface.co/Fsoft-AIC/Phi3-SigLiP-MoE/tree/main/sft/smoe_cosinegating)             | [Link](https://huggingface.co/Fsoft-AIC/Phi3-CLIP-MoE/tree/main/sft/smoe_cosinegating)              |
|                   | Sigmoid-R         | [Link](https://huggingface.co/Fsoft-AIC/Phi3.5-Siglip-MoE/tree/main/sft)                                   | [Link](https://huggingface.co/Fsoft-AIC/Phi3-SigLiP-MoE/tree/main/sft)                               | [Link](https://huggingface.co/Fsoft-AIC/Phi3-CLIP-MoE/tree/main/sft)                                |
|                   | Hyper-R           | [Link](https://huggingface.co/Fsoft-AIC/Phi3.5-Siglip-MoE/tree/main/sft/hyperrouter)                        | [Link](https://huggingface.co/Fsoft-AIC/Phi3-SigLiP-MoE/tree/main/sft/hyperrouter)                   | [Link](https://huggingface.co/Fsoft-AIC/Phi3-CLIP-MoE/tree/main/sft/hyperrouter)                    |
|                   | Perturbed Cosine-R| [Link](https://huggingface.co/Fsoft-AIC/Phi3.5-Siglip-MoE/tree/main/sft/smoe_perturbed)                     | [Link](https://huggingface.co/Fsoft-AIC/Phi3-SigLiP-MoE/tree/main/sft/smoe_perturbed)                | [Link](https://huggingface.co/Fsoft-AIC/Phi3-CLIP-MoE/tree/main/sft/smoe_perturbed)                 |
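
Each link in the table points to a subfolder of a Hugging Face repository, so a single checkpoint can probably be fetched without cloning the whole repo. The following is a minimal sketch, assuming the `huggingface_hub` package is installed; the helper names `split_checkpoint_url` and `download_checkpoint` and the local directory are hypothetical and are not part of this project's code.

```python
from urllib.parse import urlparse


def split_checkpoint_url(url: str) -> tuple[str, str, str]:
    """Split a Hugging Face tree URL into (repo_id, revision, subfolder)."""
    parts = urlparse(url).path.strip("/").split("/")
    # Expected layout: <org>/<repo>/tree/<revision>/<subfolder...>
    if len(parts) < 4 or parts[2] != "tree":
        raise ValueError(f"not a Hugging Face tree URL: {url}")
    repo_id = "/".join(parts[:2])
    revision = parts[3]
    subfolder = "/".join(parts[4:])
    return repo_id, revision, subfolder


def download_checkpoint(url: str, local_dir: str) -> str:
    """Download only the files under the subfolder that the URL points to."""
    # Imported lazily so the URL helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    repo_id, revision, subfolder = split_checkpoint_url(url)
    # Restrict the download to one subfolder; None fetches the whole repo.
    patterns = [f"{subfolder}/*"] if subfolder else None
    return snapshot_download(
        repo_id=repo_id,
        revision=revision,
        allow_patterns=patterns,
        local_dir=local_dir,
    )


if __name__ == "__main__":
    # Example: the Pre-Training checkpoint for SigLIP 224 + Phi3.5.
    # "./checkpoints" is an arbitrary local path chosen for illustration.
    path = download_checkpoint(
        "https://huggingface.co/Fsoft-AIC/Phi3.5-Siglip-MoE/tree/main/pretrain",
        local_dir="./checkpoints",
    )
    print(path)
```

Note that the Sigmoid-R links point at the `sft` and `sft_full` folders themselves, so the pattern built here would also match the other routers' subfolders stored beneath them.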
