# SeedGNN: Graph Neural Networks for Supervised Seeded Graph Matching

This repository is the implementation of "SeedGNN: Graph Neural Networks for Supervised Seeded Graph Matching".

## Requirements

* Python (>=3.8)
* PyTorch (>=1.2.0)
* PyTorch Geometric (>=1.5.0)
* Numpy (>=1.20.1)
* Scipy (>=1.6.2)
* seaborn (0.11.2)
* graspologic (>=1.0.0)


## Preparing Data

The Facebook network data used in our paper can be downloaded [here](https://archive.org/download/oxford-2005-facebook-matrix/facebook100.zip).
Then, unzip the downloaded file and put the 'facebook100' folder under './data'.

The Shrec'16 dataset can be downloaded [here](https://vision.in.tum.de/~laehner/shrec2016/files/TOPKIDS_lowres.zip). 
Then, unzip the downloaded file, rename the folder 'low resolution' to 'low_resolution', and put this folder under './data'.
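
The manual steps above can also be scripted. The following is a minimal sketch, assuming both archives have already been downloaded and that each one unpacks into the top-level folder named above. The helper `extract_into_data` and the local archive paths in the comments are hypothetical and not part of this repository.

```python
import zipfile
from pathlib import Path


def extract_into_data(zip_path, data_dir="./data", rename=None):
    """Unzip `zip_path` into `data_dir`, then optionally rename extracted folders.

    `rename` maps an extracted folder name to the name the scripts expect,
    e.g. {"low resolution": "low_resolution"}.
    """
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(data_dir)
    for old_name, new_name in (rename or {}).items():
        source = data_dir / old_name
        target = data_dir / new_name
        # Only rename when the source exists and the target name is still free.
        if source.is_dir() and not target.exists():
            source.rename(target)
    return data_dir


# Example calls (hypothetical local paths to the downloaded archives):
#   extract_into_data("facebook100.zip")
#   extract_into_data("TOPKIDS_lowres.zip",
#                     rename={"low resolution": "low_resolution"})
```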

## Training

To train the model(s) in the paper, run this command:

```
python train.py
```

The trained model is then saved as 'SeedGNN-model-trained.pth' in the folder './model'.

## Evaluation

* To evaluate our model on ER graphs and generate the results for Fig.5 in our paper, run:

```
python TestER.py
```
    
Our SeedGNN achieves the following matching accuracy (%) on sparse ER graphs (_n = 500, p=0.01, s=0.8_):
    
| Fraction of Seeds |   0%  |  2%  | 4%  | 6%  | 8%  | 10% | 12% | 14% | 16% | 18% | 20% |
| ----------------- |------- | ----- | ----- |----- |----- |----- |----- |----- |----- |----- |-----|
|      SeedGNN      |  0.3 | 15.1 | 47.4 | 82.8 | 96.0 | 96.6 | 97.0 | 97.6 | 97.6 | 97.6 | 97.6 |
|    1-hop (_T=6_)  |  0.2 |  1.5 | 2.89 | 4.91 | 6.0 | 9.2 | 12.5 | 15.3 | 19.7 | 23.6 | 32.3|
|    2-hop (_T=3_)  |  0.2 | 2.4 | 18.0 | 57.9 | 81.1 | 92.1 | 95.8 | 96.4 | 96.4 | 96.7 | 97.0 |
|    3-hop (_T=2_)  |  0.3 | 2.4 | 7.1 | 29.7 | 64.9 | 90.8 | 96.0 | 97.1 | 97.2 | 97.4 | 97.5 |
|           PGM     |  0.2 | 2.3 | 6.1 | 16.3 | 31.6 | 54.5 | 73.3 | 79.2 | 86.3 | 88.9 | 92.7 |
|        SGM        |  0.3 | 3.6 | 8.9 | 13.8 | 22.3 | 36.3 | 54.5 | 67.3 | 84.4 | 89.6 | 91.6 |
|        MGCN       |  0.1 | 2.0 | 4.0 | 6.7 | 8.4 | 11.1 | 12.4 | 14.0 | 16.3 | 18.9 | 20.5 |
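
The accuracies in these tables can be read as the percentage of nodes mapped to their true counterpart. A minimal sketch of that computation, assuming predictions and ground truth are lists in which index `i` holds the node matched to node `i`. The function name `matching_accuracy` is hypothetical, and the evaluation scripts may treat seed nodes differently.

```python
def matching_accuracy(predicted, truth):
    """Return the percentage of nodes whose predicted match equals the true match."""
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")
    if not truth:
        return 0.0
    correct = sum(1 for p, t in zip(predicted, truth) if p == t)
    return 100.0 * correct / len(truth)
```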

Our SeedGNN achieves the following matching accuracy (%) on dense ER graphs (_n = 500, p=0.2, s=0.8_):
    
| Fraction of Seeds |  0% | 0.5% | 1% | 1.5% | 2% | 2.5% | 3% | 3.5% | 4% | 4.5% | 5%|
| ----------------- |------- | ----- | ----- |----- |----- |----- |----- |----- |----- |----- |-----|
|      SeedGNN      |  0.1 | 0.7 | 91.4 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
|    1-hop (_T=6_)  |  0.1 | 0.7 | 2.3 | 7.4 | 95.0 | 100 | 100 | 100 | 100 | 100 | 100 |
|    2-hop (_T=3_)  |  0.0 | 0.7 | 2.2 | 5.6 | 46.6 | 100 | 100 | 100 | 100 | 100 | 100 |
|    3-hop (_T=2_)  |  0.1 | 0.2 | 0.38 | 0.6 | 0.4 | 0.54 | 0.9 | 1.3 | 1.2 | 1.4 | 2.1 |
|         PGM       | 0.1 | 0.6 | 1.8 | 4.3 | 19.3 | 51.2 | 96.6 | 100 | 100 | 100 | 100 |
|        SGM        |  0.2 | 1.5 | 85.8 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
|        MGCN       |  0.1 | 0.7 | 1.5 | 1.9 | 3.7 | 5.2 | 6.9 | 8.0 | 10.9 | 12.3 | 13.7 |

Note that we use the public implementation of [MGCN](https://github.com/sheldonresearch/MGCN) in our code.
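
The ER experiments are parameterized by the number of nodes _n_, the edge probability _p_, the correlation _s_, and the fraction of seeds. The sketch below shows one common way to build such a correlated pair: sample a parent graph with edge probability _p_, then keep each parent edge independently with probability _s_ in each of the two graphs. This is an assumption for illustration only; the generator used by `TestER.py` may parameterize the model differently, and the function name `correlated_er_pair` is hypothetical.

```python
import random


def correlated_er_pair(n, p, s, seed_fraction, rng=None):
    """Sample two correlated ER graphs, the hidden node mapping, and a seed set.

    Returns (edges1, edges2, perm, seeds), where node u of graph 1 corresponds
    to node perm[u] of graph 2 and `seeds` reveals part of that mapping.
    """
    if rng is None:
        rng = random.Random()
    # Hidden ground truth: a uniformly random relabeling of graph 2.
    perm = list(range(n))
    rng.shuffle(perm)
    edges1, edges2 = set(), set()
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:  # edge (u, v) exists in the parent graph
                if rng.random() < s:  # keep it in graph 1
                    edges1.add((u, v))
                if rng.random() < s:  # keep it, relabeled, in graph 2
                    a, b = perm[u], perm[v]
                    edges2.add((min(a, b), max(a, b)))
    # Reveal the true match for a random fraction of the nodes.
    num_seeds = round(seed_fraction * n)
    seeds = {u: perm[u] for u in rng.sample(range(n), num_seeds)}
    return edges1, edges2, perm, seeds
```

For example, `correlated_er_pair(500, 0.01, 0.8, 0.1)` would correspond to the sparse setting above with 10% seeds under this assumed parameterization.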

----------------------------------------
* To evaluate our model on the Shrec'16 dataset and generate the results for Fig.7 in our paper, run:

```
python TestShrec16.py
```
Our SeedGNN achieves the following matching accuracy (%) on the Shrec'16 dataset:
    
| Fraction of Seeds |   0%  |  1%  | 2%  | 3%  | 4%  | 5% | 6% | 7% | 8% | 9% | 10% |
| ----------------- |------- | ----- | ----- |----- |----- |----- |----- |----- |----- |----- |-----|
|      SeedGNN      |  0.6 | 43.0 | 80.7 | 94.2 | 94.2 | 94.2 | 94.2 | 94.2 | 94.2 | 94.5 | 94.7 |
|    1-hop (_T=6_)  |  0.0 | 0.5 | 0.7 | 1.7 | 4.0 | 4.1 | 4.1 | 4.9 | 7.6 | 9.4 | 10.2  | 
|    2-hop (_T=3_)  |  0.0 | 27.2 | 58.4 | 74.5 | 84.7 | 89.4 | 92.0 | 94.0 | 94.2 | 94.2 | 94.6 |
|    3-hop (_T=2_)  |  0.0 | 31.0 | 56.1 | 69.3 | 82.0 | 85.9 | 90.1 | 91.58 | 92.8 | 92.6 | 93.5|
|           PGM     |  0.0 | 18.4 | 41.0 | 55.2 | 67.9 | 77.3 | 82.8 | 88.7 | 89.6 | 90.7 | 91.7 |
|        SGM        |  0.0 | 16.2 | 31.3 | 53.1 | 64.5 | 72.6 | 80.2 | 85.0 | 90.8 | 90.7 | 93.0 |
|        MGCN       |  0.0 | 0.5 | 0.7 | 1.7 | 4.1 | 4.3 | 4.7 | 4.9 | 7.6 | 9.4 | 10.2 |

Note that we use the public implementation of [MGCN](https://github.com/sheldonresearch/MGCN) in our code.

Table 1 in our paper is generated using the public implementations of [CrossMNA](https://github.com/ChuXiaokai/CrossMNA), [MGCN](https://github.com/sheldonresearch/MGCN), [GMN, PCA-GM](https://github.com/stones-zl/PCA-GM), [DGMC](https://github.com/rusty1s/deep-graph-matching-consensus), [BB-GM](https://github.com/martius-lab/blackbox-deep-graph-matching), and [DGM](https://github.com/Zerg-Overmind/QC-DGM) on the Shrec'16 dataset.

----------------------------------------

* To evaluate our model on Facebook networks and generate the results for Fig.6 in our paper, run:

```
python TestFacebook.py
```
Our SeedGNN achieves the following matching accuracy (%) on Facebook networks: 
 
| Fraction of Seeds |  0% | 0.5% | 1% | 1.5% | 2% | 2.5% | 3% | 3.5% | 4% | 4.5% | 5%|
| ----------------- |------- | ----- | ----- |----- |----- |----- |----- |----- |----- |----- |-----|
|      SeedGNN      | 1.6 | 2.7 | 22.0 | 78.1 | 83.9 | 84.1 | 85.3 | 84.0 | 83.7 | 84.4 | 84.9|
|    1-hop (_T=6_)  | 1.0 | 1.8 | 6.3 | 8.3 | 32.1 | 56.0 | 71.6 | 81.3 | 83.8 | 83.8 | 84.3|
|    2-hop (_T=3_)  | 1.7 | 3.8 | 9.4 | 15.3 | 47.3 | 67.8 | 76.8 | 80.1 | 82.5 | 82.7 | 83.8|
|    3-hop (_T=2_)  | 0.6 | 2.2 | 5.6 | 8.2 | 15.6 | 26.4 | 34.9 | 49.4 | 69.5 | 71.5 | 73.6    |
|         PGM       | 0.6 | 2.4 | 5.1 | 7.9 | 17.8 | 31.1 | 44.3 | 60.0 | 75.9 | 77.8 | 77.4 | 
|        SGM        | 2.2 | 3.5 | 17.4 | 73.1 | 79.5 | 83.8 | 83.6 | 83.9 | 83.7 | 83.3 | 83.9 | 
|        MGCN       | 0.2 | 0.7 | 1.6 | 6.2 | 13.5 | 19.1 | 25.3 | 36.8 | 54.9 | 67.4 | 77.6 |

Note that we use the public implementation of [MGCN](https://github.com/sheldonresearch/MGCN) in our code.

----------------------------------------

* To evaluate our model on the Willow Object dataset, run:

```
python TestWillow.py
```

Our SeedGNN achieves the following matching accuracy (%) on the Willow Object dataset:

|       Method      |  face | mbike | car | duck | wbottle | Mean |
| ----------------- |------- | ----- | ----- |----- |----- |----- |
|      SeedGNN      | 100.0 |  98.9 | 98.0 | 93.1 | 98.7 | 97.7 |
|    DGMC+SeedGNN   | 100.0 | 99.6 | 100.0  | 99.7 | 99.2 | 99.5 |

For the semi-supervised algorithms, we use the publicly available implementations from their respective papers to generate the corresponding matching results.  
The results of the existing supervised algorithms are directly retrieved from their respective papers.

----------------------------------------


* To verify the effectiveness of our design choices and generate the results for Fig.8 in our paper, run:

```
python TestDesign.py
```

The variants of SeedGNN achieve the following matching accuracy (%) on ER graphs (_n = 500, p=0.04, s=0.8_):
    
| Fraction of Seeds |  0% | 0.5% | 1% | 1.5% | 2% | 2.5% | 3% | 3.5% | 4% | 4.5% | 5%|
| ----------------- |------- | ----- | ----- |----- |----- |----- |----- |----- |----- |----- |-----|
|      SeedGNN      |  0.6 | 1.0 | 13.4 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
|     SeedGNN-Hun   |  0.0 | 1.2 | 10.2 | 55.2 | 80.2 | 89.4 | 95.0 | 100 | 100 | 100 | 100 |
|     SeedGNN-per   |  0.2 | 0.4 | 1.6 | 2.4 | 10.6 | 45.4 | 96.4 | 100 | 100 | 100 | 100 |
|     SeedGNN-van   |  0.0 | 0.8 | 1.0 | 1.6 | 2.0 | 2.4 | 3.2 | 3.6 | 4.0 | 4.8 | 5.0 | 
|     SeedGNNx      |  0.0 | 0.4 | 2.4 | 2.8 | 7.8 | 8.8 | 20.8 | 29.0 |  100 | 100 | 100 | 

----------------------------------------


* To examine which training samples are necessary for generalization and generate the results for Fig.9 in our paper, run:

```
python TestTrainingSet.py
```

Our SeedGNN trained with different training sets achieves the following matching accuracy (%) on ER graphs:


Fix _n=500, s=0.8, theta=0.05_:

| Graph Sparsity _p_ |  0.02 | 0.04 | 0.06 | 0.08 | 0.10 | 0.12 | 0.14 | 0.16 | 0.18 | 0.2 |
| --------------- |------- | ----- | ----- |----- |----- |----- |----- |----- |----- |----- |
|      SeedGNN      |  100 |  100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
|     SeedGNN p1   |  99.6 | 100 | 100 | 98.2 | 12.4 | 5.8 | 5.2 | 5.4 | 5.2 | 5.0 |
|     SeedGNN p2   |  62.2 | 86.2 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |

Fix _n=500, p=0.08, theta=0.05_:

| Graph Correlation _s_ |  0.5 | 0.6 | 0.7 | 0.8 | 0.9 | 1.0 | 
| --------------- |------- | ----- | ----- |----- |----- |----- |
|      SeedGNN      | 88.6 | 100 | 100 | 100 | 100 | 100 |
|     SeedGNN s1   |  14.4 | 34.8 | 74.4 | 100 | 100 | 100 | 
|     SeedGNN s2   | 39.6 | 62.4 | 100 | 100 | 100 | 100 | 
|     SeedGNN s3   | 91.2 | 100 | 99.2 | 74.4 | 46.0 | 11.2 |

Fix _n=500, p=0.04, s=0.8_:

| Fraction of Seeds |  0% | 0.5% | 1% | 1.5% | 2% | 2.5% | 3% | 3.5% | 4% | 4.5% | 5%|
| ----------------- |------- | ----- | ----- |----- |----- |----- |----- |----- |----- |----- |-----|
|      SeedGNN      |  0.6 | 1.0 | 11.4 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
|     SeedGNN t1   |  0.0 | 1.4 | 5.0 | 18.6 | 76.8 | 81.0 | 87.0 | 94.2 | 98.4 | 100 |  100 |
|     SeedGNN t2   |  0.0 | 1.2 | 11.6 | 99.1 |  100 | 100 | 100 | 100 | 100 | 100 | 100|

----------------------------------------

* To generate the layer-wise matching results for Fig.10 and Fig.11 in our paper, run:

```
python Simmatrix.py
```

All the subfigures in Fig.10 and Fig.11 will be generated.
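
To reproduce all of the results in one go, the evaluation scripts listed in this section can be run in sequence. A minimal sketch, assuming it is executed from the repository root after training (or with the pre-trained model in place); the helper `run_evaluations` is hypothetical and not part of this repository.

```python
import subprocess
import sys

# Evaluation scripts named in this README, in the order they appear.
EVAL_SCRIPTS = [
    "TestER.py",
    "TestShrec16.py",
    "TestFacebook.py",
    "TestWillow.py",
    "TestDesign.py",
    "TestTrainingSet.py",
    "Simmatrix.py",
]


def run_evaluations(scripts=EVAL_SCRIPTS, dry_run=False):
    """Run each script with the current interpreter and return the commands used."""
    commands = [[sys.executable, script] for script in scripts]
    if not dry_run:
        for command in commands:
            # check=True stops at the first script that exits with an error.
            subprocess.run(command, check=True)
    return commands
```

Calling `run_evaluations(dry_run=True)` only returns the commands without executing them, which is useful for checking the list before starting a long run.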

## Pre-trained Models

The pre-trained model can be found in the folder './model'.


