# FragFM  

Repository for **FragFM**
 

## Install

To train models and generate molecules with **FragFM**, install the required packages with the script below.

```bash
conda create -n fragfm python=3.11 -y
conda activate fragfm
conda install scipy==1.14.1 numpy==1.26.4 pandas==2.2.3 scikit-learn==1.5.2 -y

pip install torch==2.1.0+cu118 torchvision==0.16.0+cu118 torchaudio==2.1.0+cu118 -f https://download.pytorch.org/whl/torch_stable.html 
pip install torch-scatter==2.0.9 torch-sparse==0.6.15 torch-cluster==1.6.0 torch-geometric==2.1.0.post1 -f https://data.pyg.org/whl/torch-1.11.3+cu113.html


pip install rdkit==2023.9.2 
pip install git+https://github.com/bp-kelley/descriptastorus
pip install wandb==0.18.6 lmdb==1.5.1 pyyaml==6.0.1 easydict==1.13 parmap==1.7.0

pip install -e .
```


## Processing Data 


### Processing the Main Data

Place all input files in the `data/raw` directory.
The processed data will be stored in the `data/processed` directory.

```bash
sh process/process_all_data.sh
```
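
As a minimal sketch, the processing command can be wrapped in a guard that first checks whether `data/raw` contains any files. The helper `has_inputs` and the `RAW_DIR` variable are hypothetical names introduced for illustration; they are not part of the repository.

```bash
# Hypothetical helper: succeed only if the directory exists and is non-empty.
has_inputs() {
  [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# RAW_DIR is an illustrative variable; it defaults to the directory named above.
RAW_DIR="${RAW_DIR:-data/raw}"

if has_inputs "$RAW_DIR"; then
  sh process/process_all_data.sh
else
  echo "No input files found in $RAW_DIR; skipping processing." >&2
fi
```

This only avoids launching the processing script on an empty or missing input directory; it does not validate the contents of the files.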


## Training Models

Training consists of two steps: training the autoencoder and training the flow model.
Example configuration files are stored in `cfgs`.
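
The two steps can be chained in a small wrapper, sketched below under the assumption that they run in the order listed (autoencoder first, then flow model). The functions `run_stage` and `train_all` and the `DRY_RUN` variable are hypothetical names introduced here; the config paths are supplied by the caller.

```bash
# Hypothetical helper: print a command, then run it unless DRY_RUN=1.
run_stage() {
  echo "+ $*"
  if [ "${DRY_RUN:-0}" != "1" ]; then
    "$@"
  fi
}

# Hypothetical wrapper.
# $1: autoencoder config file, $2: flow model config file.
# The flow model stage starts only if the autoencoder stage succeeded.
train_all() {
  run_stage python train_ae.py "$1" && run_stage python train_flow.py "$2"
}
```

Calling `DRY_RUN=1 train_all <ae yaml file> <flow yaml file>` prints the two commands without executing them, which is a quick way to check the arguments before starting a long run.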



### Training the Coarse-to-Fine Autoencoder

Trained checkpoints are saved automatically to `save/ae_model`.

```bash
python train_ae.py <yaml file>
```


### Training the Flow Model


Trained checkpoints are saved automatically to `save/flow_model`.


```bash
python train_flow.py <yaml file>
```


## Generation with FragFM

Generated samples are saved automatically to `save/generate`.

```bash
python generate.py <yaml file>
```
