# Code for the ICLR 2025 submission: "GRAIN: Exact Graph Reconstruction from Gradients"
## Prerequisites
- Install Anaconda. 
- Create the conda environment:

> conda env create -f environment.yml -n grain

- Activate the created environment:

> conda activate grain

- Create the necessary folders:

> mkdir -p models

- Download the [pretrained model](https://mega.nz/file/CNpBFAoQ#BQai6AnWrgxbobIMjLhg5LqjEl5D3xr8MFEao6oX2uU) and the [original model](https://mega.nz/file/qN5zmC7Q#IxwMVI1QeCjcFoett0fz-VYT6JGUoaePBPDBxBUggoEand) and store them in the `models` folder.


## Adapting the configuration

We provide a configuration file, `graph_attack_config.yaml`, for adapting the attack setting and model hyperparameters. Before running an attack, edit the file to match the experiment you want to replicate. Relevant parameters include:

- **DATASET**: The dataset to run the attack on. Must be one of `zpn/tox21_srp53`, `zpn/clintox` or `zpn/bbbp`.

- **N_LAYERS_GCN**: The number of GCN layers in the model. Default is 2.

- **HIDDEN_SIZE**, **NODE_EMBEDDING_DIM**, **READOUT_HIDDEN_DIM**: The embedding dimensions of the GCN components, referred to as $d$ in the paper. We set them equal for simplicity, but they can be changed independently. Default is 300.

- **ACT**: The activation function used in the GCN. Currently supports `relu` or `gelu`.

- **GRAPH_CLASS**: A boolean flag that selects between graph classification and node classification. Default is True.

- **model_path**: The path to a pretrained model. Defaults to none.

- **TOL_L**: The singular value tolerance used to determine the rank of the gradient at the given layer. We use only the default values, but they can be changed if needed.
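Since every run is driven by this file, it can help to check the values before starting a long experiment. The following is a minimal sketch and not part of the repository: the helper `validate_config` is hypothetical, and it assumes the configuration has already been loaded into a flat Python mapping whose keys match the parameter names listed above and that all of them are set explicitly.

```python
# Hypothetical helper, not part of the repository. It assumes the YAML file
# has been loaded into a flat dict keyed by the parameter names listed above.
ALLOWED_DATASETS = {"zpn/tox21_srp53", "zpn/clintox", "zpn/bbbp"}
ALLOWED_ACTIVATIONS = {"relu", "gelu"}
DIM_KEYS = ("HIDDEN_SIZE", "NODE_EMBEDDING_DIM", "READOUT_HIDDEN_DIM")


def is_positive_int(value):
    """True for integers >= 1 (bool is excluded even though it subclasses int)."""
    return isinstance(value, int) and not isinstance(value, bool) and value >= 1


def validate_config(config):
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []

    if config.get("DATASET") not in ALLOWED_DATASETS:
        problems.append("DATASET must be one of " + ", ".join(sorted(ALLOWED_DATASETS)))

    if config.get("ACT") not in ALLOWED_ACTIVATIONS:
        problems.append("ACT must be 'relu' or 'gelu'")

    if not is_positive_int(config.get("N_LAYERS_GCN")):
        problems.append("N_LAYERS_GCN must be a positive integer")

    for key in DIM_KEYS:
        if not is_positive_int(config.get(key)):
            problems.append(key + " must be a positive integer")

    if not isinstance(config.get("GRAPH_CLASS"), bool):
        problems.append("GRAPH_CLASS must be a boolean")

    return problems


# Example mapping using the default values stated in the list above.
example = {
    "DATASET": "zpn/bbbp",
    "N_LAYERS_GCN": 2,
    "HIDDEN_SIZE": 300,
    "NODE_EMBEDDING_DIM": 300,
    "READOUT_HIDDEN_DIM": 300,
    "ACT": "relu",
    "GRAPH_CLASS": True,
}
print(validate_config(example))
```

Returning a list of problems rather than raising on the first one lets all mistakes in a configuration be reported at once.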

## Running the baselines

We provide modified versions of DLG and TabLeak in the `baseline` folder. Both use the configuration file described above to determine the GCN structure.

### Parameters

- **DATASET**: The dataset to run the attack on. Must be one of `zpn/tox21_srp53`, `zpn/clintox` or `zpn/bbbp`.

- **CONFIG_PATH**: The path to the configuration file.

### Command

To run DLG:

> ./dlg.sh DATASET --config_path CONFIG_PATH

To run DLG with a given adjacency matrix:

> ./dlg.sh DATASET --config_path CONFIG_PATH --fix_A

To run TabLeak:

> ./tableak.sh DATASET --config_path CONFIG_PATH

To run TabLeak with a given adjacency matrix:

> ./tableak.sh DATASET --config_path CONFIG_PATH --fix_A

Further changes (e.g. to the architecture) can be made in the configuration file.
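To run both baselines on all three datasets, the commands above can be assembled programmatically. The following is a minimal sketch and not part of the repository: the helpers `build_baseline_command` and `run_all` are hypothetical, only the script names and flags shown above are used, and it assumes the script is started from the folder containing `dlg.sh` and `tableak.sh`.

```python
import subprocess

# Script names and flags are taken from the commands above.
# The helpers themselves are hypothetical and not part of the repository.
BASELINE_SCRIPTS = {"dlg": "./dlg.sh", "tableak": "./tableak.sh"}
DATASETS = ("zpn/tox21_srp53", "zpn/clintox", "zpn/bbbp")


def build_baseline_command(baseline, dataset, config_path, fix_a=False):
    """Build the argument list for one baseline run."""
    command = [BASELINE_SCRIPTS[baseline], dataset, "--config_path", config_path]
    if fix_a:
        # Run the baseline with a given adjacency matrix.
        command.append("--fix_A")
    return command


def run_all(config_path, fix_a=False, dry_run=True):
    """Run (or, with dry_run=True, only print) every baseline on every dataset."""
    for baseline in BASELINE_SCRIPTS:
        for dataset in DATASETS:
            command = build_baseline_command(baseline, dataset, config_path, fix_a)
            if dry_run:
                print(" ".join(command))
            else:
                # check=True stops the sweep if a run exits with an error.
                subprocess.run(command, check=True)


# Dry run: prints the six commands without executing them.
run_all("graph_attack_config.yaml")
```

Calling `run_all("graph_attack_config.yaml", dry_run=False)` would execute the runs one after another; passing `fix_a=True` appends `--fix_A` to each command.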

## GRAIN Experiments (Tables 1, 2, 3, 4)
### Parameters
- **CONFIG_PATH**: The path to the configuration file. All relevant parameters can be adapted here.

### Commands
There is a single command; all other adaptations are made in the configuration file:

> ./attack.sh --config_path CONFIG_PATH

## Fine-tuning the GCN

To fine-tune the GCN with the provided configuration, run:

> python3 training_gcn.py --config_path CONFIG_PATH

## General notes

We recommend running the experiments with the [Neptune](https://neptune.ai/) framework, as this allows all results to be viewed without printing overhead. It can be enabled by specifying a workspace under the `--neptune` parameter in the baseline scripts.
