# GraphK

After training, the generated graphs are saved in the `generated_figures` folder. A few example outputs are already included.

## Installation
We recommend using a conda environment (the commands below assume CUDA 12.4). Follow these steps to set up the library.

1. Create a conda environment
```bash
conda create -n graphk python=3.12
```

2. After activating the environment (`conda activate graphk`), install PyTorch, PyTorch Geometric, and their dependencies using pip
```bash
pip install torch==2.5.0 torchvision==0.20.0 torchaudio==2.5.0 --index-url https://download.pytorch.org/whl/cu124
```

```bash
pip install torch_geometric
```

```bash
pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-2.5.0+cu124.html
```

3. Install the remaining libraries
```bash
pip install -r requirements.txt
```
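
As a quick sanity check after installation, the sketch below reports which of the packages named in the steps above cannot be found in the active environment. It is only an illustration and not part of the repository; the helper `missing_packages` is a hypothetical name, and the check covers only the packages listed explicitly above, not the contents of `requirements.txt`.

```python
import importlib.util

# Top-level import names of the packages installed in steps 2 above.
REQUIRED = [
    "torch", "torchvision", "torchaudio", "torch_geometric",
    "pyg_lib", "torch_scatter", "torch_sparse", "torch_cluster",
    "torch_spline_conv",
]


def missing_packages(names):
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages were found.")
```

An empty result only means the packages are importable by name; it does not confirm that the installed builds match CUDA 12.4.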

To train the GraphK model, run the main script:

```bash
python main.py
```

Important parameters in `config.yaml`:
- **graph:** community2, community3, protein, citeseer
- **encoder_model:** node2vec, gvae
- **sampler_model:** gmm (Gaussian Mixture Model), dpgmm (Dirichlet Process Gaussian Mixture Model)
- **decoder_model:** dot_product, mlp

- **n_samples:** auto or an integer (if auto, the generated graphs have the same size as the input graph)
- **decoder.{decoder_model}.threshold:** this value directly affects the density of the generated graph
- **construction.method:** null, random, kdtree
- **topk:** auto or an integer (if auto, k is chosen based on the average degree)
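
To illustrate why the decoder threshold affects graph density, here is a minimal sketch of a thresholded dot-product decoder on random toy embeddings. It is not the repository's implementation: the sigmoid scoring, the function names `decode_edges` and `density`, and the toy data are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_edges(embeddings, threshold):
    """Keep edge (i, j) when sigmoid(z_i . z_j) exceeds the threshold (assumed rule)."""
    edges = []
    n = len(embeddings)
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(embeddings[i], embeddings[j]))
            if sigmoid(dot) > threshold:
                edges.append((i, j))
    return edges


def density(n_nodes, edges):
    """Fraction of possible undirected edges that are present."""
    return len(edges) / (n_nodes * (n_nodes - 1) / 2)


rng = random.Random(0)  # fixed seed so the toy embeddings are repeatable
toy_embeddings = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(30)]

for threshold in (0.3, 0.5, 0.7, 0.9):
    d = density(len(toy_embeddings), decode_edges(toy_embeddings, threshold))
    print(f"threshold={threshold:.1f} density={d:.3f}")
```

Under this assumed rule, raising the threshold can only remove edges, so the density never increases as the threshold grows.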

Training is fast, so you can experiment with these values and visualize the results.
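
When trying several settings, it can help to enumerate the combinations up front. The sketch below builds every combination of the model and graph options listed above; the helper `config_grid` is a hypothetical name, and writing each combination into `config.yaml` before running `main.py` is left to the reader, since the file's exact layout is not shown here.

```python
from itertools import product

# Option values copied from the parameter list above.
OPTIONS = {
    "graph": ["community2", "community3", "protein", "citeseer"],
    "encoder_model": ["node2vec", "gvae"],
    "sampler_model": ["gmm", "dpgmm"],
    "decoder_model": ["dot_product", "mlp"],
}


def config_grid(options):
    """Yield one dict per combination of the given option values."""
    keys = list(options)
    for values in product(*(options[key] for key in keys)):
        yield dict(zip(keys, values))


grid = list(config_grid(OPTIONS))
print(len(grid), "combinations; first:", grid[0])
```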

## Reproducibility
- For the community graphs, we use p=1, q=0.1
- For the other graphs, we use p=1, q=0.5
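
The text does not say what p and q are. Assuming they are the return and in-out parameters of node2vec's biased random walk, the sketch below shows the unnormalized step weights they produce on a toy graph. The function `node2vec_weight` and the toy graph are illustrative only and are not taken from the repository.

```python
def node2vec_weight(prev, candidate, neighbors, p, q):
    """Unnormalized weight of stepping to `candidate` when the walk came from `prev`.

    `neighbors` maps each node to the set of its adjacent nodes.
    """
    if candidate == prev:
        return 1.0 / p  # step back to the previous node
    if candidate in neighbors[prev]:
        return 1.0      # stay close: candidate is also adjacent to prev
    return 1.0 / q      # move outward, away from prev


# Toy graph with edges 0-1, 0-3, 1-2, 1-3.
neighbors = {0: {1, 3}, 1: {0, 2, 3}, 2: {1}, 3: {0, 1}}

# The walk arrived at node 1 from node 0; the candidates are node 1's neighbors.
for p, q in ((1, 0.1), (1, 0.5)):
    weights = {c: node2vec_weight(0, c, neighbors, p, q) for c in sorted(neighbors[1])}
    print(f"p={p}, q={q}: {weights}")
```

Under this reading, a smaller q puts more weight on outward steps, so q=0.1 explores further from the previous node than q=0.5.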