# Neural Energy Minimization for Molecular Conformation Optimization

This repository is the official implementation of Neural Energy Minimization for Molecular Conformation Optimization.

## Requirements

To install the requirements:

```bash
conda env create -f environment.yml
```
If the packages pinned in `environment.yml` do not match your CUDA version, install [PyTorch](https://pytorch.org/),
[PyTorch Geometric](https://pytorch-geometric.readthedocs.io/en/latest/), and
[DGL](https://docs.dgl.ai/en/0.6.x/index.html) manually, choosing builds compatible with your CUDA version.

## Data and Pre-trained Models

The processed data and the pre-trained models can be found in this [folder](https://www.dropbox.com/sh/zh6eyp3z0ryjgxj/AAC1Tx-id7o07eO-Z6Ce4k6ya?dl=0). You can also download them with the following commands:

```bash
# for dataset
wget -O data.zip 'https://www.dropbox.com/sh/bcidyj2mbgy5dp2/AAB_lXSjadWI1wUk6WZgLEBGa?dl=1' 
unzip data.zip -d data

# for pre-trained models
wget -O pretrained_models.zip 'https://www.dropbox.com/sh/cq6ho0imyynkfpg/AACq0GW_auRdLXAIQicnG56wa?dl=1' 
unzip pretrained_models.zip -d pretrained_models
```
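
After unzipping, it can help to confirm that the files used by the evaluation commands further down are where those commands expect them. The following is a minimal sketch, not part of the repository: the helper name `find_missing` is hypothetical, and the path list is only what this README's own example commands reference, so the archives may contain more.

```python
from pathlib import Path

# Paths referenced by the example evaluation commands in this README.
EXPECTED_PATHS = [
    "data/qm9/qm9_test.pkl",
    "pretrained_models/conf_opt/qm9_our_o2",
    "pretrained_models/prop_pred_with_gt/qm9_homo",
]


def find_missing(root=".", expected=EXPECTED_PATHS):
    """Return the expected paths that do not exist under `root` (hypothetical helper)."""
    root = Path(root)
    return [p for p in expected if not (root / p).exists()]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing after download/unzip:")
        for p in missing:
            print(f"  {p}")
    else:
        print("All expected paths are present.")
```

Run it from the repository root. An empty result means the example commands in the Evaluation section should at least find their inputs.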

## Training

To train a conformer optimization model, pick one value from each braced set (the braces are placeholders, not shell syntax):

```bash
python train_conf.py --config configs/{qm9, drug}_default.yml --model_type {equi_se3trans, egnn, ours_o2, ours_o3}
```
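
The braced placeholders above expand to one concrete command per dataset and model type. The sketch below lists them; it assumes every dataset/model combination is supported, which this README does not confirm, and the function name `build_train_commands` is illustrative only.

```python
from itertools import product

# Option sets copied from the placeholder command above.
DATASETS = ["qm9", "drug"]
MODEL_TYPES = ["equi_se3trans", "egnn", "ours_o2", "ours_o3"]


def build_train_commands():
    """Return one argument list per (dataset, model type) combination."""
    commands = []
    for dataset, model_type in product(DATASETS, MODEL_TYPES):
        commands.append([
            "python", "train_conf.py",
            "--config", f"configs/{dataset}_default.yml",
            "--model_type", model_type,
        ])
    return commands


if __name__ == "__main__":
    for cmd in build_train_commands():
        # Print only; pass `cmd` to subprocess.run(cmd, check=True) to launch it.
        print(" ".join(cmd))
```

Printing rather than launching keeps the sketch side-effect free. Each argument list can be handed to `subprocess.run` unchanged.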

To train a property prediction model:

```bash
python train_prop.py --config configs/qm9_prop_default.yml --model_type ours_o2 --target_name homo
```

## Evaluation

To evaluate the conformer optimization model:

```bash
python eval_conf.py --ckpt_path <model_path> --test_dataset <dataset_path>
```

For example, once the data and pre-trained models are downloaded (see above), the following command evaluates our two-atom model (`qm9_our_o2`) on the QM9 test set:

```bash
python eval_conf.py --ckpt_path pretrained_models/conf_opt/qm9_our_o2 --test_dataset data/qm9/qm9_test.pkl
```

To evaluate the property prediction model:
```bash
python eval_prop.py --ckpt_path pretrained_models/prop_pred_with_gt/qm9_homo
```

To dump the optimized conformers and reproduce the results with error bars reported in the paper, run:
```bash
python dump_confs.py \
  --test_dataset data/qm9/qm9_test.pkl \
  --ckpt_path_list pretrained_models/conf_opt/qm9_our_o2 \
  --dump_dir dump_confopt_results \
  --filter_pos False --rdkit_pos_mode all
```

To evaluate the conformer generation model:
```bash
python eval_sampling.py --ckpt_path <model_path> --eval_propose_net_type random --eval_noise 0.028
```
or:
```bash
python eval_sampling.py --ckpt_path <model_path> --eval_propose_net_type online_rdkit --eval_noise 0.
```

## Results

Performance of the baseline models and our models on the QM9 and GEOM-Drug datasets:

QM9:

| Model name         |      mean RMSD      |      median RMSD     |
| ------------------ |-------------------- | -------------------- |
| RDKit+MMFF         |  0.3872 +/- 0.0029  | 0.2756 +/- 0.0075    |
| SE(3)-Trans.       |  0.2476 +/- 0.0021  | 0.1657 +/- 0.0022    |
| EGNN               |  0.2101 +/- 0.0009  | 0.1356 +/- 0.0013    |
| Ours-TwoAtom       |  0.1415 +/- 0.0004  | 0.0534 +/- 0.0002    |
| Ours-Ext_v         |  0.1383 +/- 0.0005  | **0.0505** +/- 0.0001 |
| Ours-ThreeAtom     |  **0.1374** +/- 0.0004 | 0.0522 +/- 0.0002 |


GEOM-Drug:

| Model name         |      mean RMSD      |      median RMSD     |
| ------------------ |-------------------- | -------------------- |
| RDKit+MMFF         |  1.7913 +/- 0.0030  | 1.6433 +/- 0.0097    |
| SE(3)-Trans.       |  1.0050 +/- 0.0022  | 0.9139 +/- 0.0041    |
| EGNN               |  1.0405 +/- 0.0018  | 0.9598 +/- 0.0038    |
| Ours-TwoAtom       |  0.8839 +/- 0.0014  | 0.7733 +/- 0.0026    |
| Ours-Ext_v         |  0.8691 +/- 0.0015  | 0.7535 +/- 0.0028    |
| Ours-ThreeAtom     | **0.8567** +/- 0.0014 | **0.7192** +/- 0.0024 |
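
Each table entry has the form "value +/- spread" for the mean and median RMSD. This README does not say how the spread is obtained, so the sketch below rests on an assumption: the mean and median are computed per run over per-molecule RMSD values, and the spread is the sample standard deviation across runs. The function name and the input layout are hypothetical and do not describe the output format of `dump_confs.py`.

```python
import statistics


def summarize_runs(rmsd_runs):
    """Summarize per-molecule RMSD values collected over several runs.

    rmsd_runs: list of runs, each a list of per-molecule RMSD floats.
    Returns {"mean": (avg, spread), "median": (avg, spread)}.
    """
    per_run_mean = [statistics.fmean(run) for run in rmsd_runs]
    per_run_median = [statistics.median(run) for run in rmsd_runs]

    def avg_and_spread(values):
        # Assumption: spread is the sample standard deviation across runs,
        # which needs at least two runs; otherwise report 0.0.
        spread = statistics.stdev(values) if len(values) > 1 else 0.0
        return statistics.fmean(values), spread

    return {
        "mean": avg_and_spread(per_run_mean),
        "median": avg_and_spread(per_run_median),
    }


if __name__ == "__main__":
    # Made-up numbers purely for illustration, not results from the paper.
    runs = [[0.10, 0.20, 0.60], [0.12, 0.18, 0.66]]
    for name, (avg, spread) in summarize_runs(runs).items():
        print(f"{name} RMSD: {avg:.4f} +/- {spread:.4f}")
```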


## Contributing

This code repository is for double-blind paper review only.
