# 3D Molecule Generation from Rigid Motifs via SE(3) Flows

This code is built on [FoldFlow](https://github.com/DreamFold/FoldFlow). We thank [Bose et al.](https://arxiv.org/abs/2310.02391) for their awesome work.

## Installation

```bash
# Install micromamba
"${SHELL}" <(curl -L micro.mamba.pm/install.sh)

# From the MotiFlow directory, install dependencies and activate the environment
micromamba create -f environment.yaml
micromamba activate motiflow-env
```

## Dataset

For the QM9 and GEOM-Drugs datasets, we used the splits and preprocessing from previous works. For QM9, please follow [END](https://github.com/frcnt/equivariant-neural-diffusion) by [Cornet et al.](https://arxiv.org/abs/2506.10532). The preprocessed GEOM-Drugs dataset is available in the [GeoDiff](https://github.com/MinkaiXu/GeoDiff) repository.
The QMugs dataset can be downloaded [here](https://libdrive.ethz.ch/index.php/s/X5vOBNSITAG5vzM). Store the datasets in the `data` folder; the fragmented datasets are expected in the `fragmented_data` folder.
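
As a quick sanity check before running anything, the folder layout described above can be verified with a few lines of Python. This is a minimal sketch: the folder names `data` and `fragmented_data` come from the text, while the helper `missing_dataset_dirs` is a hypothetical name and not part of the repository.

```python
from pathlib import Path

# Folder names taken from the section above; the helper itself is illustrative.
EXPECTED_DIRS = ("data", "fragmented_data")


def missing_dataset_dirs(repo_root="."):
    """Return the expected dataset folders that do not exist under repo_root."""
    root = Path(repo_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("Dataset folders found.")
```

Run it from the repository root; it only checks that the folders exist, not what they contain.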

## Fragmentation

To reproduce the fragmentation described in the paper, one has to first align the order of canonical SMILES strings to the order of atomic coordinates from the dataset. For GEOM-Drugs, this logic is in `fragmentation/geom_dataset_alignment.py`.
The rigid-body motif fragmentation on GEOM-Drugs itself is performed in `fragmentation/geom_fragmentation.py`. For an interactive walkthrough, one may follow the notebook `fragmentation/geom_fragmentation.ipynb` instead.
For conciseness, we provide the whole fragmentation logic for QM9 in a single notebook `fragmentation/qm9_fragmentation.py`. For QMugs, the logic is identical to GEOM-Drugs.
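
To illustrate what the alignment step achieves, the sketch below reorders dataset coordinates into SMILES atom order once an atom mapping is known. It is not the code from `fragmentation/geom_dataset_alignment.py`: the function name `reorder_coordinates` and the convention that `mapping[i]` holds the dataset index of SMILES atom `i` are assumptions made for this example, and obtaining the mapping itself (e.g. by graph matching) is left out.

```python
def reorder_coordinates(smiles_symbols, dataset_symbols, dataset_coords, mapping):
    """Return dataset coordinates permuted into SMILES atom order.

    mapping[i] is the dataset index of the atom at position i in the
    SMILES ordering (a hypothetical convention for this sketch).
    """
    n = len(dataset_coords)
    if len(dataset_symbols) != n or len(smiles_symbols) != n:
        raise ValueError("atom counts differ")
    if sorted(mapping) != list(range(n)):
        raise ValueError("mapping must be a permutation of the dataset indices")
    # Guard against a wrong mapping: matched atoms must be the same element.
    for i, j in enumerate(mapping):
        if smiles_symbols[i] != dataset_symbols[j]:
            raise ValueError(f"element mismatch at SMILES atom {i}")
    return [dataset_coords[j] for j in mapping]


# Toy example: SMILES lists C then O, the dataset stores O then C.
aligned = reorder_coordinates(
    ["C", "O"],
    ["O", "C"],
    [(0.0, 0.0, 1.2), (0.0, 0.0, 0.0)],
    [1, 0],
)
print(aligned)
```

The element check is a cheap way to catch a mapping that is a valid permutation but pairs the wrong atoms.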

## Training and Inference

The rest of the code is organised as follows. The data-related code is stored in `motiflow/data`. The main components of the MotiFlow framework, namely the SE(3) flow matching for rigid-motif frames and the discrete flow for rigid-motif classes, are provided in `motiflow/models`. The IPA-based architecture is in `motiflow/models/components`. Various utilities, mostly related to the handling of rigid bodies and rotations, are stored in `motiflow/utils`.
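
For readers new to SE(3) flows, the sketch below shows one common way to interpolate between two rigid frames: a geodesic (slerp) on the rotation and a straight line on the translation. It is an illustration under stated assumptions, not the implementation in `motiflow/models`; the quaternion layout `(w, x, y, z)` and the names `slerp` and `interpolate_frame` are chosen for this example only.

```python
import math


def slerp(q0, q1, t):
    """Geodesic interpolation between unit quaternions q0 and q1 (w, x, y, z)."""
    dot = sum(a * b for a, b in zip(q0, q1))
    # q and -q encode the same rotation; flip q1 to follow the shorter arc.
    if dot < 0.0:
        q1 = tuple(-c for c in q1)
        dot = -dot
    theta = math.acos(min(1.0, dot))
    if theta < 1e-8:
        return tuple(q0)  # the two rotations (almost) coincide
    s = math.sin(theta)
    w0 = math.sin((1.0 - t) * theta) / s
    w1 = math.sin(t * theta) / s
    return tuple(w0 * a + w1 * b for a, b in zip(q0, q1))


def interpolate_frame(frame0, frame1, t):
    """Interpolate (quaternion, translation) frames at time t in [0, 1]."""
    (q0, x0), (q1, x1) = frame0, frame1
    q_t = slerp(q0, q1, t)
    x_t = tuple((1.0 - t) * a + t * b for a, b in zip(x0, x1))
    return q_t, x_t


# Identity frame at the origin to a 90-degree turn about z, shifted along x.
half = math.sqrt(0.5)
start = ((1.0, 0.0, 0.0, 0.0), (0.0, 0.0, 0.0))
end = ((half, 0.0, 0.0, half), (2.0, 0.0, 0.0))
q_mid, x_mid = interpolate_frame(start, end, 0.5)
print(q_mid, x_mid)
```

At `t = 0.5` the example yields a 45-degree rotation about z and the midpoint of the two translations.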

To train and sample from MotiFlow, one may use the `runner/train.py` script. We use Hydra to handle the configuration files; the parameters of MotiFlow can be adjusted in `runner/config`.
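
Hydra also accepts `key=value` overrides on the command line, so parameters can be changed without editing the files in `runner/config`. The sketch below assembles such a command from Python. The helper `build_train_command` and the override key `experiment.seed` are hypothetical; the real keys are whatever the configuration files in `runner/config` define.

```python
import shlex
import sys


def build_train_command(overrides=None, script="runner/train.py"):
    """Build an argv list that runs the training script with Hydra overrides."""
    args = [sys.executable, script]
    for key, value in (overrides or {}).items():
        # Hydra reads command-line overrides in the form key=value.
        args.append(f"{key}={value}")
    return args


# "experiment.seed" is a placeholder key, not one confirmed by the repository.
cmd = build_train_command({"experiment.seed": 0})
print(shlex.join(cmd))
# To launch the run: import subprocess; subprocess.run(cmd, check=True)
```

Printing the command first makes it easy to confirm the overrides before starting a long training job.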
