Metadata-Version: 2.1
Name: gflownet
Version: 0.1.0
Author-email: Recursion Pharmaceuticals <devs@recursionpharma.com>
Keywords: gflownet
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Requires-Python: <3.11,>=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: torch==2.1.2
Requires-Dist: torch-geometric==2.4.0
Requires-Dist: torch-scatter==2.1.2
Requires-Dist: torch-sparse==0.6.18
Requires-Dist: torch-cluster==1.6.3
Requires-Dist: rdkit
Requires-Dist: tables
Requires-Dist: scipy
Requires-Dist: meeko
Requires-Dist: numpy
Requires-Dist: networkx
Requires-Dist: tensorboard
Requires-Dist: cvxopt
Requires-Dist: pyarrow
Requires-Dist: gitpython
Requires-Dist: botorch
Requires-Dist: pyro-ppl
Requires-Dist: gpytorch
Requires-Dist: omegaconf>=2.3
Requires-Dist: wandb
Requires-Dist: pandas
Provides-Extra: dev
Requires-Dist: bandit[toml]; extra == "dev"
Requires-Dist: black; extra == "dev"
Requires-Dist: isort; extra == "dev"
Requires-Dist: mypy; extra == "dev"
Requires-Dist: pip-compile-cross-platform; extra == "dev"
Requires-Dist: pre-commit; extra == "dev"
Requires-Dist: pytest; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Requires-Dist: ruff; extra == "dev"
Requires-Dist: tox; extra == "dev"
Requires-Dist: typeguard; extra == "dev"
Requires-Dist: types-pkg_resources; extra == "dev"
Requires-Dist: gitpython>=3.1.30; extra == "dev"



[![arXiv](https://img.shields.io/badge/arXiv-2405.01155-b31b1b.svg)](https://arxiv.org/abs/2405.01155)
[![Python versions](https://img.shields.io/badge/Python-3.10-blue)](https://www.python.org/downloads/)
[![license: MIT](https://img.shields.io/badge/License-MIT-purple.svg)](LICENSE)

![GFlowNet](docs/synflownet_logo.png)

# SynFlowNet - Design of Diverse and Novel Molecules with Synthesis Constraints

Official implementation of SynFlowNet, a GFlowNet model with a synthesis action space. The paper is available on [arXiv](https://arxiv.org/abs/2405.01155).

**Primer**

SynFlowNet is a GFlowNet model that generates molecules from chemical reactions and available building blocks. It is trained to sample molecules with probabilities proportional to their rewards. This repo contains instructions for training SynFlowNet and sampling synthesisable molecules. The code builds upon the codebase provided by [recursionpharma/gflownet](https://github.com/recursionpharma/gflownet), available under the [MIT](https://github.com/recursionpharma/gflownet/blob/trunk/LICENSE) license. For a GFlowNet primer and an overview of the base repository, visit [recursionpharma/gflownet](https://github.com/recursionpharma/gflownet).

![GFlowNet](docs/concept.png)
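
The phrase "probabilities proportional to their rewards" can be made concrete with a toy calculation. The sketch below is not SynFlowNet training code: the molecule names and reward values are hypothetical placeholders, and it only shows the target distribution p(x) = R(x) / Z that a trained sampler is meant to approximate.

```python
from collections import Counter
import random


def target_distribution(rewards):
    """Normalise non-negative rewards into probabilities p(x) = R(x) / Z."""
    total = sum(rewards.values())
    if total <= 0:
        raise ValueError("at least one reward must be positive")
    return {x: r / total for x, r in rewards.items()}


# Hypothetical rewards for three placeholder molecules (not real data).
toy_rewards = {"mol_a": 1.0, "mol_b": 3.0, "mol_c": 6.0}
probs = target_distribution(toy_rewards)

# Sampling from the target: higher-reward items are drawn more often,
# but lower-reward items still appear, which is what keeps samples diverse.
rng = random.Random(0)
samples = rng.choices(list(probs), weights=list(probs.values()), k=1000)
counts = Counter(samples)
```

Unlike reward maximisation, which would return only `mol_c`, sampling in proportion to reward keeps every item with non-zero reward reachable.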

## Installation

### PIP

This package can be installed with pip. Because it depends on prebuilt torch-geometric extension wheels, the `--find-links` argument must also be specified. For GPU use (CUDA 12.1):

```bash
conda create -n synflow python=3.10
conda activate synflow
pip install torch==2.1.2 --index-url https://download.pytorch.org/whl/cu121
pip install -e . --find-links https://data.pyg.org/whl/torch-2.1.2+cu121.html
```
Or for CPU use:

```bash
conda create -n synflow python=3.10 -y
conda activate synflow
pip install torch==2.1.2 --index-url https://download.pytorch.org/whl/cpu
pip install -e . --find-links https://data.pyg.org/whl/torch-2.1.2+cpu.html
```
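
After installing, it may help to confirm that the environment matches the pins in this package's metadata. The following is a minimal sketch using only the standard library; the helper names `python_supported` and `check_pins` are illustrative and not part of this repository.

```python
import sys
from importlib import metadata

# Pins copied from this package's metadata; adjust if they change.
EXPECTED_PINS = {
    "torch": "2.1.2",
    "torch-geometric": "2.4.0",
    "torch-scatter": "2.1.2",
    "torch-sparse": "0.6.18",
    "torch-cluster": "1.6.3",
}


def python_supported(version_info=sys.version_info):
    """True when the interpreter satisfies Requires-Python: >=3.10,<3.11."""
    return (3, 10) <= tuple(version_info[:2]) < (3, 11)


def check_pins(pins=EXPECTED_PINS):
    """Map each package name to 'ok', 'missing', or a mismatch message."""
    report = {}
    for name, wanted in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        # Wheels from the PyTorch and PyG indexes may carry a local tag
        # such as "+cu121"; compare only the public version part.
        base = found.split("+")[0]
        report[name] = "ok" if base == wanted else f"found {found}, expected {wanted}"
    return report


if __name__ == "__main__":
    print("Python supported:", python_supported())
    for name, status in check_pins().items():
        print(f"{name}: {status}")
```

A `missing` entry for one of the torch extension packages usually means the `--find-links` URL was not passed or does not match the installed torch build.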

## Getting started

### Data

The training relies on two data sources: modified _Hartenfeller-Button_ reaction templates and _Enamine_ building blocks. The building blocks are not freely available and can be obtained upon request from [enamine.net/building-blocks/building-blocks-catalog](https://enamine.net/building-blocks/building-blocks-catalog). Instructions can be found in `src/gflownet/data/building_blocks/`.

### Training

The model can be trained by running `src/gflownet/tasks/reactions_task.py` using different reward functions implemented in the same file. You may want to change the default configuration in `main()`.
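
As a minimal sketch, training could be launched from Python as shown below, assuming the current working directory (or `repo_root`) is the root of this repository. The helper names are hypothetical, and no command-line flags are passed because this README only mentions configuring the run in `main()`.

```python
import subprocess
import sys
from pathlib import Path

# Script path as given in this README, relative to the repository root.
TRAIN_SCRIPT = Path("src/gflownet/tasks/reactions_task.py")


def build_train_command(repo_root="."):
    """Return the argv list that runs the training script with this interpreter."""
    script = (Path(repo_root) / TRAIN_SCRIPT).resolve()
    return [sys.executable, str(script)]


def run_training(repo_root="."):
    """Run training from the repository root; raise if the script is missing or fails."""
    script = (Path(repo_root) / TRAIN_SCRIPT).resolve()
    if not script.is_file():
        raise FileNotFoundError(f"training script not found: {script}")
    # check=True turns a non-zero exit status into CalledProcessError.
    subprocess.run(build_train_command(repo_root), check=True, cwd=repo_root)
```

Calling `run_training()` is equivalent to running the script with the active environment's interpreter, so it picks up the packages installed in the `synflow` environment when that environment is active.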

#### [Optional] If using GPU-accelerated Vina

For easy adaptation to other targets, a GPU-accelerated version of Vina docking can be used to calculate rewards as binding affinities to targets of interest. Follow the instructions at [this repo](https://github.com/DeltaGroupNJUPT/Vina-GPU-2.1) to compile an executable for `QuickVina2-GPU-2-1`. Once done, place the executable in `bin/`.
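
Before starting a docking-based run, a quick check that the compiled binary is in place can save a failed job. This is a sketch under one assumption: the file under `bin/` is named `QuickVina2-GPU-2-1`, which this README does not state explicitly. The helper `find_vina_executable` is illustrative and not part of this repository.

```python
import os
from pathlib import Path

# Assumed file name: this README names the build target `QuickVina2-GPU-2-1`
# but does not state the exact file name expected under bin/.
VINA_NAME = "QuickVina2-GPU-2-1"


def find_vina_executable(bin_dir="bin", name=VINA_NAME):
    """Return the executable's path, or None if it is absent or not executable."""
    candidate = Path(bin_dir) / name
    if candidate.is_file() and os.access(candidate, os.X_OK):
        return candidate
    return None


if __name__ == "__main__":
    path = find_vina_executable()
    print(f"Found: {path}" if path else f"{VINA_NAME} not found in bin/")
```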

## Citation

If you use this code in your research, please cite the following paper:

```
@article{cretu2024synflownet,
      title={SynFlowNet: Design of Diverse and Novel Molecules with Synthesis Constraints},
      author={Miruna Cretu and Charles Harris and Ilia Igashov and Arne Schneuing and Marwin Segler and Bruno Correia and Julien Roy and Emmanuel Bengio and Pietro Liò},
      journal={arXiv preprint arXiv:2405.01155},
      year={2024}
}
```
