Installing requirements:
Create a new conda environment by running <conda create --name myenv --file requirements.txt>

Ground truth construction:
Code is available in the 'ground_truth' directory
Run <python main.py> as is to produce GEDs for VG-DENSE.
To run on other datasets, change the paths in the variables 'large_graphs' and 'large_graphs_idx'
so that they point to the appropriate pickle files.
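
Before pointing 'large_graphs' and 'large_graphs_idx' at new pickle files, it can help to check
what each file contains. The helper below is a minimal sketch and is not part of the repository:
the function name 'describe_pickle' is hypothetical, and no structure is assumed for the loaded
object. Only unpickle files from a source you trust, since unpickling can execute arbitrary code.

```python
import pickle


def describe_pickle(path):
    """Load a pickle file and return (type name, number of items or None)."""
    with open(path, "rb") as f:
        obj = pickle.load(f)
    # Not every pickled object has a length, so fall back to None.
    size = len(obj) if hasattr(obj, "__len__") else None
    return type(obj).__name__, size
```

Calling describe_pickle on the file intended for 'large_graphs' and comparing the result with the
same call on the default VG-DENSE file is one way to confirm that the two have the same layout.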

Training GNN:
Code is available in the 'gnns/approach_1' directory
- Create the output directory if it does not exist (i.e. in the code/ directory, run <mkdir outs>)
- Make sure you have generated GED pairs to use as labels during similarity-based training
- Modify 'config.py' as needed; comments are provided next to the parameters
- Run <python main.py>. After training, the embeddings and the trained model are dumped in the output directory

Graph kernels:
Code is available in the 'graph_kernels' directory
Results for the PM kernel are provided in the appendix of the paper.
This script uses 5 different kernels (the other 4 exhibit worse performance).
- Change the paths in lines marked with '###' if running on a different data subset
- Run <python kernels.py>

Evaluation:
Code is available in the 'evaluation' directory
- Change the variables 'new_idxs', 'vg_dense_classification', 'geds' in 'eval.py' according to the data subset in use
- Provide paths to embeddings or graph kernel matrices (i.e. populate the lists 'sims_rank', 'embedding_res')
  Comment out lines if you are not running the full evaluation.
  The current script evaluates the best GNN embedding models for the 3 variants and the 5 graph kernels.
- Run <python eval.py>

Graph construction:
Code for the construction of graphs for the experiments (D/P-SGG, D/P-CAPTION, SMARTY, AG)
is available in the 'graph-extraction' directory

All data files are provided in the ANONYMIZED drive
'https://drive.google.com/drive/folders/11frvSzMLZ0PeUx-hGZ6Rp3f3XtPCCYEy?usp=sharing' in pickle form for
full reproducibility. The drive also contains notebooks that run the experiments (by ANONYMOUS user).

The file 'glove_emb_300.pkl' can also be found in this drive (it is too big to include inside the zip).
It can also easily be recreated as a dict whose keys are words and whose values are their
300-dimensional embeddings as produced by GloVe (which is publicly available).
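
A minimal sketch of how such a dict could be built from a GloVe text file is given below. It is
not the script used to produce the provided file. It assumes the standard GloVe text format (one
token per line, followed by its vector components separated by spaces) and stores each vector as
a plain list of floats; the function names are hypothetical, and the code that consumes
'glove_emb_300.pkl' may expect another value type (e.g. numpy arrays).

```python
import pickle


def build_glove_dict(lines, dim=300):
    """Parse GloVe text lines ('word v1 ... v_dim') into a {word: vector} dict."""
    emb = {}
    for line in lines:
        parts = line.rstrip("\n").split(" ")
        if len(parts) <= dim:
            continue  # skip blank or malformed lines
        # The last `dim` fields are the vector; the token itself may contain spaces.
        word = " ".join(parts[:-dim])
        emb[word] = [float(x) for x in parts[-dim:]]
    return emb


def save_glove_pickle(txt_path, pkl_path, dim=300):
    """Read a GloVe text file and dump the resulting dict as a pickle file."""
    with open(txt_path, encoding="utf-8") as f:
        emb = build_glove_dict(f, dim)
    with open(pkl_path, "wb") as f:
        pickle.dump(emb, f)
    return emb
```

For example, save_glove_pickle('glove.840B.300d.txt', 'glove_emb_300.pkl') would write the dict,
where the input file name is only an example of a 300-dimensional GloVe download.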



