# Model Fusion of Heterogeneous Neural Networks via Cross-Layer Alignment
This repository is the official implementation of CLAFusion.
![CLAFusion](./clafusion_framework.png)

## Requirements 

The code was developed with Python 3.6. To install the requirements:

```setup
pip install -r requirements.txt
```

The bash scripts also require the `bc` command. On Debian or Ubuntu, install it with:

```setup
apt-get install bc
```
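
A quick way to confirm that `bc` is on the PATH before launching a script is sketched below. The helper name `require_cmd` is hypothetical and not part of this repository; it relies only on the POSIX `command -v` built-in.

```shell
# Hypothetical helper (not part of this repository): report whether a command is on PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing" >&2
        return 1
    fi
}

# Check for bc and print an install hint if it is absent.
require_cmd bc || echo "Install it with: apt-get install bc"
```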

## How to use this repository
The bash scripts for running the experiments in the paper are in the **sh** folder. Before running any script, **be sure to put the model checkpoints in the directory specified by the DATA_PATH variable.**
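
As a minimal sketch, a check like the following can catch a missing or empty checkpoint directory before a long run starts. The function name `check_checkpoint_dir` is hypothetical, and the sketch assumes `DATA_PATH` is exported in your shell with the same value used in the script you plan to run.

```shell
# Hypothetical helper (not part of this repository): verify a checkpoint directory.
check_checkpoint_dir() {
    dir="$1"
    if [ -z "$dir" ]; then
        echo "checkpoint directory is not set" >&2
        return 1
    fi
    if [ ! -d "$dir" ]; then
        echo "checkpoint directory does not exist: $dir" >&2
        return 1
    fi
    # An empty directory most likely means the checkpoints were not copied yet.
    if [ -z "$(ls -A "$dir")" ]; then
        echo "checkpoint directory is empty: $dir" >&2
        return 1
    fi
    echo "checkpoint directory looks usable: $dir"
}

# Assumption: DATA_PATH is exported here with the same value as in the script.
check_checkpoint_dir "${DATA_PATH:-}" || echo "Fix DATA_PATH before running the scripts in sh/."
```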

You can run your own experiments by changing the parameters in the corresponding script or by creating a new one. Parameter descriptions are available in parameters.py.

## Training

To train the models in the paper, run the following commands:
* Skill transfer
```
bash sh/train_mlp_p_split.sh
```
* RESNET18 + RESNET34
```
bash sh/train_resnet18_resnet34.sh
```
* Teacher-student setting
```
bash sh/train_mlp_width.sh
bash sh/train_vgg11_half_vgg13_doub.sh
```
* Ablation studies
```
bash sh/train_mlp_ablation_studies.sh
```

## CLAFusion
To fuse the models in the paper, run the following commands:
* Skill transfer
```
bash sh/fuse_mlp_p_split.sh
```
* Ablation studies
```
bash sh/align_mlp_ablation_studies.sh
bash sh/fuse_mlp_ablation_studies.sh
```

## CLAFusion + Finetuning
To fuse and then finetune the models in the paper, run the following commands:
* RESNET18 + RESNET34
```
bash sh/fuse_resnet18_resnet34.sh
```
* Teacher-student setting
```
bash sh/fuse_vgg11_half_vgg13_doub.sh
```

## Model transfer
To fuse the models and run the model transfer methods in the paper, run this command:
```
bash sh/transfer_resnet18_resnet34.sh
```

## Knowledge distillation
To run the knowledge distillation experiments:
```
bash sh/distill_vgg.sh
bash sh/distill_vgg_small.sh
```

## Pre-trained Models

You can download the pre-trained models from the following link: [My pre-trained models](https://bit.ly/3uUe7t4).
Please do not modify or delete files in this folder.
Please do not share the link with unauthorized people. 

## Results

Our models achieve the following performance, averaged across different seeds.

| Exp name            | Model A | Model B |
| ------------------- | ------- | ------- |
| Skill transfer      |  90.76  |  87.58  |
| RESNET34 + RESNET18 |  93.31  |  92.92  |
| Teacher-student MLP |  97.42  |  97.33  |
| Teacher-student VGG |  92.65  |  89.72  |
| Ablation study 1    |  96.95  |  97.59  |
| Ablation study 2    |  96.95  |  97.75  |
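
If you rerun an experiment with several seeds, the averaging step can be sketched as follows. The function name `mean_over_seeds` is hypothetical and not part of this repository, and the input values are placeholders rather than results from the paper; the sketch assumes you have collected one number per seed by hand.

```shell
# Hypothetical helper (not part of this repository): average one value per seed.
mean_over_seeds() {
    printf '%s\n' "$@" | awk '{ sum += $1; n += 1 } END { if (n > 0) printf "%.2f\n", sum / n }'
}

# Placeholder values for three seeds, not results from the paper.
mean_over_seeds 90.0 91.0 92.0   # prints 91.00
```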

## Contributing
If you'd like to contribute or have any suggestions, you can contact us at someone@gmail.com or open an issue on this GitHub repository.

All contributions welcome! All content in this repository is licensed under the MIT license.

## Acknowledgement
The structure of this repository is largely based on the official implementation of [Model Fusion via Optimal Transport](https://github.com/sidak/otfusion). For model transfer, we adapt the source code of [Heterogeneous Model Transfer between Different Neural Network](https://anonymous.4open.science/r/6ab184dc-3c64-4fdd-ba6d-1e5097623dfd/a_hetero_model_transfer.py). 