# EvoEGF-Mol

EvoEGF-Mol is a generative flow framework for structure-based drug design that operates on exponential-family manifolds. EvoEGF-Mol supports pocket-specific molecule generation for de novo and lead optimization tasks.

Technical details and evaluation results are provided in our paper:
* EvoEGF-Mol: Evolving Exponential Geodesic Flow for Structure-based Drug Design (coming soon)


<!-- <p align="center">
    <img src="resources/workflow.png" width="600"/>
</p> -->



## Table of Contents
- [EvoEGF-Mol](#evoegf-mol)
  - [Table of Contents](#table-of-contents)
  - [Installation](#installation)
  - [Prepare Dataset](#prepare-dataset)
  - [Model weights](#model-weights)
  - [Training](#training)
  - [Inference](#inference)
  - [Evaluation](#evaluation)
  - [License](#license)
  - [Citation](#citation)


## Installation
Build the environment with the setup script (the default CUDA version is 12.4):
```
./setup_env.sh
```
To activate the environment, run:
```
conda activate EvoEGF-Mol
```

## Prepare Dataset
(Coming soon.)



## Model weights
Download the pretrained checkpoint (`pretrained.ckpt`) and config (`config.yaml`) from Google Drive (link coming soon), and place them in the `./weights` folder as follows. The pretrained weights can be used directly for inference.
- 📂 weights
    - 📂 checkpoints
        - 📄 pretrained.ckpt
    - ⚙️ config.yaml
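
Before running inference, it can help to confirm that the files landed where the layout above expects them. The following is a minimal sketch, not part of the repository: the helper name `missing_weight_files` is hypothetical, and the expected paths are taken from the tree shown above.

```python
from pathlib import Path

# Relative paths taken from the folder layout above.
EXPECTED_FILES = (
    Path("checkpoints") / "pretrained.ckpt",
    Path("config.yaml"),
)


def missing_weight_files(weights_dir="weights"):
    """Return the expected files that are absent from weights_dir (hypothetical helper)."""
    root = Path(weights_dir)
    return [root / rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_weight_files()
    if missing:
        for path in missing:
            print(f"missing: {path}")
    else:
        print("weights folder looks complete")
```

Run it from the repository root so that the default `weights` path resolves correctly.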



## Training
To train the model, first make sure you have prepared the dataset as described in [Prepare Dataset](#prepare-dataset) and placed it in the right folder. Modifying `./configs/default.yaml` is _optional_. Then run:
```
python train.py
```
Intermediate results and checkpoints are written to `./logs`.
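
To reuse a freshly trained checkpoint for inference, you need its path inside `./logs`. The sketch below picks the most recently modified checkpoint. It rests on two assumptions that this README does not confirm: that training checkpoints use the `.ckpt` extension (as `pretrained.ckpt` does), and that they sit somewhere below `./logs`. The helper name `latest_checkpoint` is hypothetical.

```python
from pathlib import Path


def latest_checkpoint(log_dir="logs"):
    """Return the most recently modified *.ckpt under log_dir, or None (hypothetical helper)."""
    # Assumption: checkpoints use the .ckpt extension, like pretrained.ckpt.
    candidates = list(Path(log_dir).rglob("*.ckpt"))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    print(latest_checkpoint())
```

The printed path can then be passed to `--ckpt_path` when sampling.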


## Inference
As an example (make sure the checkpoint is in the right folder), run:
```
python sample_for_pocket.py --ckpt_path weights/checkpoints/pretrained.ckpt
```
The results are written to `./example/BSD_ASPTE_1_130_0/test`.

To generate molecules for the de novo task targeting a specified protein pocket, run:
```
python sample_for_pocket.py --protein_path $protein_path --ligand_path $ligand_path --ckpt_path $ckpt_path --out_fn $out_fn
```
The results are written to `$out_fn`.
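
If you drive sampling from a Python script, for example to loop over several pockets, the same command can be assembled as an argument list. This is a minimal sketch: the helper `build_de_novo_command` is hypothetical, the flags are the ones shown in the command above, and every path in the example is a placeholder.

```python
import shlex
import sys


def build_de_novo_command(protein_path, ligand_path, ckpt_path, out_fn):
    """Assemble the de novo sampling command as an argument list (hypothetical helper)."""
    return [
        sys.executable, "sample_for_pocket.py",
        "--protein_path", str(protein_path),
        "--ligand_path", str(ligand_path),
        "--ckpt_path", str(ckpt_path),
        "--out_fn", str(out_fn),
    ]


if __name__ == "__main__":
    # Placeholder paths: replace them with your own files.
    cmd = build_de_novo_command(
        "path/to/protein",
        "path/to/ligand",
        "weights/checkpoints/pretrained.ckpt",
        "path/to/output",
    )
    # Print a shell-quoted version of the command for inspection.
    print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` launches the run without shell quoting issues for paths that contain spaces.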

To generate molecules for the lead optimization task targeting a specified protein pocket, additionally pass `--fix_index` to give the indices of the ligand atoms that stay fixed. These indices can be determined using `./test/get_ligand_index.py`. Then run:
```
python sample_for_pocket.py --protein_path $protein_path --ligand_path $ligand_path --ckpt_path $ckpt_path --out_fn $out_fn --fix_index $fix_index --attachment_atoms $attachment_atoms --min_add_num $min_add_num
```
The results are written to `$out_fn`. `$attachment_atoms` specifies the anchor and is used to remove atoms added at undesired positions.


## Evaluation
(Coming soon.)



## License
This project is licensed under the terms of the GPL-3.0 license.


## Citation
```

```
