# Graph Diffusion that can Insert and Delete

This implementation is based on MiDi's code, which can be found at
https://github.com/cvignac/MiDi

We suggest you regularly check the repository for updates.

## Environment installation

  - Download anaconda/miniconda if needed
  - Create a conda environment that includes RDKit:
    
    ```conda create -c conda-forge -n griddd rdkit=2023.03.2 python=3.9```
  - `conda activate griddd`
  - Check that this line does not return an error:
    
     ``` python3 -c 'from rdkit import Chem' ```
  - Install the nvcc drivers for your CUDA version (this is the tricky part, and you may need extra steps depending on your hardware). For example:
    
    ```conda install -c "nvidia/label/cuda-11.8.0" cuda```
  - Install a matching version of PyTorch, for example:
    
    ```pip3 install torch --index-url https://download.pytorch.org/whl/cu118```
  - Install the other packages using the requirements file:
    
    ```pip install -r requirements.txt```

  - Run:
    
    ```pip install -e .```

  - Install PSI4 using the instructions you will find here: https://psicode.org/installs/v191/
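
Once the steps above are done, a quick check of the environment can save a failed run later. The following is a minimal sketch, not part of the repository: `check_environment` is a hypothetical helper, and it assumes the packages are importable as `rdkit`, `torch` and `psi4`.

```python
import importlib.util


def check_environment(modules=("rdkit", "torch", "psi4")):
    """Map each top-level module name to True if it can be found (hypothetical helper)."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    status = check_environment()
    for name, found in status.items():
        print(f"{name}: {'found' if found else 'MISSING'}")
    # Only query CUDA when torch is actually installed.
    if status.get("torch"):
        import torch

        print("CUDA available:", torch.cuda.is_available())
```

If `torch` is reported as found but CUDA is not available, the nvcc/CUDA and PyTorch installation steps are the first place to look.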

## Running the code (GrIDDD)
On QM9, run this command first to download and process the dataset correctly (you only need to run it once, when the dataset is first downloaded). The training itself will end with an error; this is expected.

``` python3 main.py general.name="GrIDDD" +experiment=qm9_no_h_adaptive.yaml features.use_ins_del=True features.charges_policy="partial" features.use_3d=True features.use_charges=True train.n_epochs=2```

For a generic training on QM9, the "base command" is the following:

``` python3 main.py general.name="DiGress_qm9" +experiment=qm9_no_h_adaptive.yaml features.charges_policy="partial"```

For a generic training on ZINC-250k, replace the experiment and charges_policy overrides with:

- ```+experiment=zinc250k.yaml features.charges_policy="dictionary"```

Extra possible commands (which can also be used for FreeGress and GrIDDD):

 - ```general.name``` is the name of the model (you do not have to use the names used above)
 - ```general.gpus```: GPU ID. To train on multiple GPUs, pass a list of IDs (e.g. ```general.gpus=[0,1]```)
 - ```general.check_val_every_n_epochs```: the model will perform a validation step once every check_val_every_n_epochs epochs
 - ```general.sample_every_val```: the model will perform a validation sampling once every sample_every_val validation steps (this is counted in validation steps, not in training steps)
 - ```general.samples_to_generate```: the model will sample samples_to_generate graphs during a validation sampling step
 - ```general.wandb```: "online" if you want to use wandb, otherwise "disabled"
 - ```general.resume```: if you interrupted the training and want to resume it from a saved checkpoint, put the checkpoint's path here
 - ```train.n_epochs```: the number of training epochs
 - ```train.batch_size```: the training batch size. NOTE: when testing on property targeting and/or optimization, set it to 1 (any value works during training)
 - ```train.lr```: the learning rate
 - ```train.diffusion_steps```: the number of diffusion steps (the parameter T)
 - ```model.n_layers```: number of layers in the main model (GrIDDD's model used to predict the number of DEL* has a different parameter)
 - ```model.lambda_train```: the hyperparameters that govern how much each term influences the loss (lambda_X, lambda_E in the paper). lambda_s and lambda_DEL* are set separately (see ```features.s_loss_lambda``` and ```features.delt_loss_lambda```)
 - ```model.nu```: sets the nu hyperparameter for the noise schedulers
 - ```train.hidden_mlp_dims```: sizes of the linear layers used to transform the inputs and outputs of the graph transformer network
 - ```train.hidden_dims```: sizes of the hidden layers used to transform the inputs and outputs of the graph transformer network. It also controls the number of attention heads and the size of the linear layers inside the attention blocks
 - ```features.sampling_nT```: if > -1, the model always starts from latents G^T of this size, instead of sampling the size from the nodes' marginal distribution.
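
The interaction between ```general.check_val_every_n_epochs``` and ```general.sample_every_val``` is easy to misread, so here is a small sketch of the resulting schedule. It assumes that validation is counted in epochs (as the option name suggests) and that it runs on epochs that are exact multiples of the interval; the actual offsets in the code may differ. `validation_schedule` is a hypothetical helper, not a function of this repository.

```python
def validation_schedule(n_epochs, check_val_every_n_epochs, sample_every_val):
    """Return (validation_epochs, sampling_epochs) using 1-based epoch numbers.

    Assumption: validation runs when the epoch number is a multiple of
    check_val_every_n_epochs, and sampling runs on every
    sample_every_val-th validation.
    """
    validation_epochs = [
        epoch
        for epoch in range(1, n_epochs + 1)
        if epoch % check_val_every_n_epochs == 0
    ]
    sampling_epochs = [
        epoch
        for index, epoch in enumerate(validation_epochs, start=1)
        if index % sample_every_val == 0
    ]
    return validation_epochs, sampling_epochs


val_epochs, sample_epochs = validation_schedule(
    n_epochs=20, check_val_every_n_epochs=5, sample_every_val=2
)
print("validation at epochs:", val_epochs)
print("sampling at epochs:", sample_epochs)
```

Under these assumptions, sampling happens once every `check_val_every_n_epochs * sample_every_val` epochs.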

To train an instance of FreeGress, use the following:

``` python3 main.py general.name="FreeGress_qm9" +experiment=qm9_no_h_adaptive.yaml features.charges_policy="partial" guidance.p_uncond=0.1 guidance.s=2 guidance.guidance_target=["mu"]```

The extra commands, which can also be used while training GrIDDD, are the following:

 - ```guidance.guidance_target```: a list of all the target properties (for instance, to use both homo and mu, set ```guidance.guidance_target=["mu","homo"]```)
 - ```guidance.p_uncond```: the p_uncond hyperparameter (rho in the FreeGress paper)
 - ```guidance.s```: the lambda hyperparameter, which controls how strongly the guidance influences the model (only needed during training if you want to sample during validation)
 - ```guidance.guidance_medium```: selects through which parts of the input (X, E, y) the guidance vector is propagated; we did not test it with different values
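
To illustrate the role of ```guidance.p_uncond```: in classifier-free guidance in general, the conditioning vector is dropped with some probability during training so that the same network also learns the unconditional model. The sketch below shows that generic idea only. `maybe_drop_guidance` and the `null_value` placeholder are hypothetical; how this repository actually represents a dropped guidance vector is not described here.

```python
import random


def maybe_drop_guidance(y, p_uncond, rng, null_value=0.0):
    """With probability p_uncond, replace the guidance vector by a placeholder.

    Returns (vector, dropped). null_value is an assumed placeholder, not
    necessarily what the actual implementation uses.
    """
    if rng.random() < p_uncond:
        return [null_value] * len(y), True
    return list(y), False


rng = random.Random(0)  # fixed seed so the sketch is reproducible
target = [1.5]          # e.g. a single property such as mu (illustrative value)
n_trials = 10_000
n_dropped = sum(maybe_drop_guidance(target, 0.1, rng)[1] for _ in range(n_trials))
print(f"dropped fraction: {n_dropped / n_trials:.3f}")
```

With `p_uncond=0.1`, roughly one training example in ten is seen without its guidance vector.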

To train an instance of GrIDDD, the main parameters are the following:

```python3 main.py general.name="GrIDDD" +experiment=qm9_no_h_adaptive.yaml features.use_ins_del=True```

The extra commands are:

 - ```zeta_D```: the D hyperparameter of the function \zeta (expressed on a scale from one to zero, where one corresponds to T)
 - ```zeta_w```: the w hyperparameter of the function \zeta
 - ```features.node_p_max```: p_max in the paper
 - ```features.node_p_min```: p_min in the paper
 - ```features.max_n```: the maximum graph size that a latent can assume
 - ```features.s_loss_lambda```: lambda_s in the paper
 - ```features.delt_loss_lambda```: lambda_delt in the paper (note that this parameter is theoretically useless since delt is an independent model)
 - ```features.freeze_n_nodes_at_sampling```: disables insert and delete during sampling
 - ```model.n_layers_delt```: n_layers' counterpart for the model specialized to predict the number of DEL*
 - ```model.hidden_mlp_dims_delt```: hidden_mlp_dims' counterpart for the model specialized to predict the number of DEL*
 - ```model.hidden_dims_delt```: hidden_dims' counterpart for the model specialized to predict the number of DEL*


To test a model, add the following parameter: ```general.test_only="<path to the checkpoint>"```

The parameters that can be used during unconditional testing are:

 - ```general.final_model_samples_to_generate```: how many samples to generate during testing
 - ```general.final_model_samples_to_save```: how many samples to save during testing, among the ones generated
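
Since testing reuses the training overrides and adds ```general.test_only```, it can be convenient to assemble the command programmatically. The following is a minimal sketch: `build_test_command` is a hypothetical helper, the checkpoint path is a placeholder, and the sample counts are illustrative values rather than recommended settings.

```python
import shlex


def build_test_command(checkpoint_path, overrides):
    """Assemble a main.py test invocation from a dict of overrides (hypothetical helper)."""
    args = ["python3", "main.py", f"general.test_only={checkpoint_path}"]
    args += [f"{key}={value}" for key, value in overrides.items()]
    # shlex.join quotes any argument that would need it in a POSIX shell.
    return shlex.join(args)


command = build_test_command(
    "/path/to/checkpoint.ckpt",  # placeholder: use your own checkpoint path
    {
        "+experiment": "qm9_no_h_adaptive.yaml",
        "features.use_ins_del": True,
        "general.final_model_samples_to_generate": 100,  # illustrative value
        "general.final_model_samples_to_save": 20,       # illustrative value
    },
)
print(command)
```

The printed string can be pasted into a shell, or the argument list can be passed to `subprocess.run` directly.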

The parameters that can be used during property targeting are:

 - ```guidance.n_test_molecules_to_sample```: number of property vectors to extract and target during testing
 - ```guidance.n_samples_per_test_molecule```: number of molecules to generate for each property vector

The parameters that can be used during property optimization are:

 - ```guidance.experiment_type```: "mae" for property targeting, or "optimization" for property optimization
 - ```guidance.improvement_type```: with "free", the target is the test property vector + guidance.improvement_threshold (used for the LogP optimization); with "fixed", the property vector is always set to guidance.improvement_target (used for QED / DRD2)
 - ```guidance.improvement_threshold```: if guidance.improvement_type is set to "free", sets the target vector's value (for instance, 0.93 for QED and 0.6 for DRD2)
 - ```guidance.improvement_target```: the conditioning target used when guidance.improvement_type = "fixed" (e.g. 0.75)
 - ```guidance.improvement_limits```: the limits within which the property needs to fall when guidance.improvement_type = "fixed" (e.g. {'qed':[0.9, 1], 'drd2': [0.5, 999999999]})
 - ```guidance.corruption_step```: how many diffusion steps are used to corrupt the original molecule
 - ```guidance.similarity_threshold```: minimum Tanimoto similarity score required between the original molecule and the final one
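
The optimization parameters above can be read as a target rule plus a success criterion. The sketch below spells this out under stated assumptions: the three functions are hypothetical helpers, fingerprints are modelled as plain sets of "on" bit indices (in practice they would come from RDKit), and the way this repository actually combines the limits and the similarity threshold may differ.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def optimization_target(test_value, improvement_type, improvement_threshold, improvement_target):
    """Conditioning target, following the description of guidance.improvement_type."""
    if improvement_type == "free":
        return test_value + improvement_threshold
    if improvement_type == "fixed":
        return improvement_target
    raise ValueError(f"unknown improvement_type: {improvement_type!r}")


def is_success(value, limits, similarity, similarity_threshold):
    """Assumed criterion: property inside the limits and molecule similar enough."""
    low, high = limits
    return low <= value <= high and similarity >= similarity_threshold


similarity = tanimoto({1, 2, 3}, {2, 3, 4})
print("similarity:", similarity)
print("success:", is_success(0.93, [0.9, 1], similarity, similarity_threshold=0.4))
```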