numpy==1.26.3
scipy==1.13.1
networkx==3.2.1
scikit-learn==1.6.1
torch==2.6.0+cu124
torch-geometric==2.6.1
ogb==1.3.6
GraKeL==0.1.10
pandas==2.2.3



# Graph Metamers: Adversarial Graph Generation Pipeline



This repository implements a framework for studying the invariance properties of Graph Neural Networks (GNNs) by generating **model metamers**: synthetic graphs optimized to match the internal activations of a pretrained GNN while their node features, structure, or both are allowed to vary. Metamers that differ from the original graph yet produce matching activations expose over-invariance in standard GNN architectures. The repository also provides quantitative metrics and mitigation strategies.



---



## 📝 Key Features



- **Model Metamers for GNNs**: Generate both feature-based and structure-based metamers using differentiable relaxation and straight-through estimators.
- **Multiple GNN Architectures**: Support for GCN, ChebNet, GraphSAGE, GIN, GAT, Graphormer, and GraphGAN.
- **Quantitative Metrics**:
  - **Feature Consistency**: Cosine similarity and classification match rate combined into a single score.
  - **Structural Consistency**: Weisfeiler-Lehman graph kernel score and label match rate.
- **Theoretical Analysis**: Local metamer dimension and activation-induced volume change derived via Jacobian rank.
- **Mitigation Strategies**: Architectural tweaks (ELU, residual connections), adversarial training, and hidden dimension scaling to reduce over-invariance.
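
To make the straight-through estimator mentioned above concrete, here is a minimal, dependency-free sketch. It is an illustration only, not the repository's implementation: the function names (`ste_forward`, `ste_backward`, `fit_binary_target`), the sigmoid relaxation, and the squared-error loss are all assumptions chosen for the example. The forward pass binarizes with a hard threshold; the backward pass pretends the threshold is the identity so that gradients can still reach the underlying logits.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def ste_forward(logits, threshold=0.5):
    """Relax logits to (0, 1), then binarize with a hard threshold."""
    soft = [sigmoid(z) for z in logits]
    hard = [1.0 if s > threshold else 0.0 for s in soft]
    return soft, hard


def ste_backward(soft, grad_hard):
    """Straight-through step: treat the threshold as the identity, so the
    gradient w.r.t. the hard values is reused for the soft values and then
    chained through the sigmoid derivative s * (1 - s)."""
    return [g * s * (1.0 - s) for g, s in zip(grad_hard, soft)]


def fit_binary_target(logits, target, lr=1.0, steps=50):
    """Gradient descent on sum((hard - target)^2) using the estimator above."""
    logits = list(logits)
    for _ in range(steps):
        soft, hard = ste_forward(logits)
        grad_hard = [2.0 * (h - t) for h, t in zip(hard, target)]
        grad_logits = ste_backward(soft, grad_hard)
        logits = [z - lr * g for z, g in zip(logits, grad_logits)]
    return ste_forward(logits)[1]


# Hypothetical toy problem: recover a binary vector from arbitrary starting logits.
result = fit_binary_target([-1.0, 2.0, 0.3], target=[1.0, 0.0, 1.0])
```

Without the straight-through step the thresholded output would have zero gradient almost everywhere, and the logits would never move. In a real pipeline an autograd framework would typically handle the backward pass; the manual version here only exposes the mechanics.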



---



## 📁 Repository Structure



```text
├── data/                    # Raw and processed graph datasets (Cora, CiteSeer, PubMed, Squirrel, Chameleon)
├── models.py                # Definitions of two-layer GNN variants
├── main_test.py             # Entrypoint: train GNN, generate metamers, evaluate metrics
├── outputs/                 # Checkpoints, heatmaps, logs, and similarity scores
└── README.md                # This file
```



---




## 🚀 Usage



Run the full pipeline:



```bash
python main_test.py [OPTIONS]
```



### Command-line Arguments



| Option               | Type  | Default  | Description                                                                |
| -------------------- | ----- | -------- | -------------------------------------------------------------------------- |
| `--dataset_name`     | str   | `PubMed` | One of: `Cora`, `CiteSeer`, `PubMed`, `Squirrel`, `Chameleon`              |
| `--seed`             | int   | `42`     | Random seed                                                                |
| `--model_type`       | str   | `GCN`    | GNN: `GCN`, `ChebNet`, `GraphSAGE`, `GIN`, `GAT`, `Graphormer`, `GraphGAN` |
| `--target_layer`     | int   | `1`      | Activation layer to match (1 or 2)                                         |
| `--gen_mode`         | str   | `feat`   | `adj`, `feat`, or `both`                                                   |
| `--hidden_channels`  | int   | `32`     | Hidden dimension size                                                      |
| `--lr_model`         | float | `0.005`  | Learning rate for GNN                                                      |
| `--wd_model`         | float | `5e-4`   | Weight decay for GNN optimizer                                             |
| `--lr_gen`           | float | `0.0005` | Learning rate for generator                                                |
| `--num_epochs_model` | int   | `10000`  | Max epochs for GNN training                                                |
| `--num_epochs_gen`   | int   | `10000`  | Max epochs for metamer generation                                          |
| `--patience1`        | int   | `500`    | Early-stop patience for metamer loss                                       |
| `--patience2`        | int   | `100`    | Early-stop patience for GNN training                                       |
| `--threshold`        | float | `0.5`    | Binarization threshold for features                                        |



**Example:**



```bash
python main_test.py \
  --dataset_name PubMed \
  --model_type GCN \
  --gen_mode both \
  --hidden_channels 32 \
  --lr_model 0.01 \
  --lr_gen 0.0005
```
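
For readers who want to script around the entrypoint, the options in the table above could be declared with `argparse` roughly as follows. This is a sketch reconstructed from the table alone: `build_parser` is a hypothetical helper, the use of `choices` is an assumption, and the actual `main_test.py` may declare or validate its options differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the option table; not the repository's code."""
    p = argparse.ArgumentParser(description="Train a GNN and generate metamers")
    p.add_argument("--dataset_name", type=str, default="PubMed",
                   choices=["Cora", "CiteSeer", "PubMed", "Squirrel", "Chameleon"])
    p.add_argument("--seed", type=int, default=42)
    p.add_argument("--model_type", type=str, default="GCN",
                   choices=["GCN", "ChebNet", "GraphSAGE", "GIN", "GAT",
                            "Graphormer", "GraphGAN"])
    p.add_argument("--target_layer", type=int, default=1, choices=[1, 2])
    p.add_argument("--gen_mode", type=str, default="feat",
                   choices=["adj", "feat", "both"])
    p.add_argument("--hidden_channels", type=int, default=32)
    p.add_argument("--lr_model", type=float, default=0.005)
    p.add_argument("--wd_model", type=float, default=5e-4)
    p.add_argument("--lr_gen", type=float, default=0.0005)
    p.add_argument("--num_epochs_model", type=int, default=10000)
    p.add_argument("--num_epochs_gen", type=int, default=10000)
    p.add_argument("--patience1", type=int, default=500)   # metamer loss
    p.add_argument("--patience2", type=int, default=100)   # GNN training
    p.add_argument("--threshold", type=float, default=0.5)
    return p


# Parse the flags from the example command instead of reading sys.argv.
args = build_parser().parse_args(
    ["--dataset_name", "PubMed", "--model_type", "GCN", "--gen_mode", "both",
     "--hidden_channels", "32", "--lr_model", "0.01", "--lr_gen", "0.0005"]
)
```

Options omitted on the command line fall back to the defaults listed in the table, so `args.seed` is `42` and `args.patience1` is `500` here.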



---



## 📈 Outputs



- **Model Checkpoint**: `outputs/model-save.pth`
- **Heatmaps**:
  - `outputs/_x-heatmap.png`: original features
  - `outputs/x-gen-heatmap.png`: generated metamer features
  - `outputs/difference-heatmap.png`: difference map

---



## 📊 Evaluation Metrics



1. **Feature Consistency** (`CS_feat`): combines cosine similarity and label match rate into a single score.
2. **Structural Consistency** (`CS_struct`): combines Weisfeiler-Lehman (WL) graph kernel similarity and label match rate.
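
The two ingredients of the feature consistency score can be sketched in plain Python. Several things here are assumptions rather than facts about this repository: the helper names, the choice to average cosine similarity per node between original and generated feature matrices, and the toy inputs. The rule for combining the two values into `CS_feat` is not stated in this README, so the sketch deliberately leaves them separate.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_cosine_similarity(x_orig, x_gen):
    """Average per-node cosine similarity between two feature matrices."""
    sims = [cosine_similarity(u, v) for u, v in zip(x_orig, x_gen)]
    return sum(sims) / len(sims)


def label_match_rate(pred_orig, pred_gen):
    """Fraction of nodes whose predicted class is the same on both graphs."""
    matches = sum(1 for a, b in zip(pred_orig, pred_gen) if a == b)
    return matches / len(pred_orig)


# Hypothetical toy data: three nodes with two features each.
x_orig = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
x_gen = [[1.0, 0.0], [1.0, 0.0], [2.0, 2.0]]
feature_sim = mean_cosine_similarity(x_orig, x_gen)

# Hypothetical predicted labels on the original and generated graphs.
match_rate = label_match_rate([0, 1, 2], [0, 1, 1])
```

A metamer that keeps `match_rate` high while `feature_sim` is low is one the model cannot tell apart from the original even though its inputs differ, which is the over-invariance the metrics are meant to surface.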



---



## 🛠 Mitigation Strategies



- **Activation**: Replace ReLU with ELU to avoid volume collapse.
- **Adversarial Training**: Improve the model's sensitivity to input perturbations.
- **Residual Connections**: Increase Jacobian rank.
- **Hidden Dimensionality**: A larger hidden size reduces over-invariance.
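
The intuition behind the activation swap can be illustrated with a small, self-contained calculation. This is a generic sketch, not the repository's analysis: it looks only at the diagonal Jacobian of an elementwise activation (ignoring the surrounding linear and message-passing layers), and the function names and pre-activation values are hypothetical. ReLU has a zero derivative wherever the pre-activation is negative, so those directions are lost; ELU's derivative is positive there, so the Jacobian stays full rank.

```python
import math


def relu_grad(x):
    """Derivative of ReLU: exactly zero for non-positive pre-activations."""
    return 1.0 if x > 0 else 0.0


def elu_grad(x, alpha=1.0):
    """Derivative of ELU: alpha * exp(x) for x <= 0, which is positive."""
    return 1.0 if x > 0 else alpha * math.exp(x)


def activation_jacobian_stats(pre_activations, grad_fn):
    """Rank, null-space dimension, and log-volume factor of the diagonal
    Jacobian of an elementwise activation at the given pre-activations."""
    diag = [grad_fn(x) for x in pre_activations]
    rank = sum(1 for d in diag if d != 0.0)
    # Directions that the activation cannot distinguish locally.
    null_dim = len(diag) - rank
    # log|det| of a diagonal matrix is the sum of log|entries|;
    # it is -inf as soon as one entry is zero (volume collapse).
    if rank == len(diag):
        log_volume = sum(math.log(d) for d in diag)
    else:
        log_volume = float("-inf")
    return rank, null_dim, log_volume


# Hypothetical pre-activations with two negative entries.
pre = [-1.0, 0.5, -0.2, 2.0]
relu_stats = activation_jacobian_stats(pre, relu_grad)
elu_stats = activation_jacobian_stats(pre, elu_grad)
```

With these inputs the ReLU Jacobian has rank 2 and a two-dimensional null space, while the ELU Jacobian has rank 4 and a finite log-volume factor. In a full network the relevant Jacobian is the product over all layers, so this only shows where the rank loss can originate.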





