# QDOT: An Efficient Quantile-weighted Distance Metric for Geometric Comparison via Optimal Transport
This repository contains the implementation of our work **"QDOT: An Efficient Quantile-weighted Distance Metric for Geometric Comparison via Optimal Transport"**.

## Introduction
Brief introduction to directories and files:
* `QDOT/`: Core code for our implementation of the QDOT/IQDOT algorithm package.
* Experiments for validating metric properties:
   * `cross_space/`: Code for implementing cross-space tasks.
   * `time_cost/`: Code for the time-cost benchmark.
   * `transfer_learning/`: Code for transfer learning.
   * `Molecule_Generation/`: Code for Molecule Generation.
   * `toy_example/`: Code for a toy example and parameter analysis.

## Requirements
* python >= 3.8
* numpy
* scipy
* matplotlib
* scikit-learn
* pytorch >= 2.4.1
* pandas
* POT
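
To confirm that the packages listed above are installed, a small check along the following lines may help. This is only a sketch and not part of the repository: the helper name `check_requirements` is hypothetical, and the distribution names assume a PyPI install (where `sklearn` is published as `scikit-learn` and PyTorch as `torch`).

```python
from importlib import metadata

# Distribution names as published on PyPI (assumption: packages were installed with pip).
REQUIRED = ["numpy", "scipy", "matplotlib", "scikit-learn", "torch", "pandas", "POT"]


def check_requirements(names=REQUIRED):
    """Map each distribution name to its installed version, or None if it is missing."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, version in check_requirements().items():
        print(f"{name:15s} {version or 'MISSING'}")
```

The script only reports what is installed. It does not compare versions against the minimums listed above, so the `pytorch >= 2.4.1` constraint still needs to be checked by eye.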

---

### Cross-Space Tasks
   1. Download the mesh data from [Deformation Transfer for Triangle Meshes](https://people.csail.mit.edu/sumner/research/deftransfer/data.html).
   2. Navigate to the `cross_space` directory and run the script:
      ```bash
      cd cross_space
      python main.py
      ```
### Time Cost Comparison
   1. To benchmark the execution time, navigate to the `time_cost` directory and run the following script:
      ```bash
      cd time_cost
      python main.py --methods IQDOT-50, SGW-50 --sizes 100, 1000, 10000
      ```
   2. The benchmark results will be saved in the `benchmark_out` directory. We have also included the results reported in our paper within this directory for reference.
### Transfer Learning

   1. Download the ModelNet40 and ShapeNetPart datasets from the [PointCloudDatasets](https://github.com/antao97/PointCloudDatasets) repository.
   2. Navigate to the `transfer_learning` directory and run the script to extract the required data classes:
      ```bash
      cd transfer_learning
      python dataset.py
      ```
   3. Compute the distance matrix:
      ```bash
      python main.py --method IQDOT --point_step 1
      ```
   4. Evaluate the results:
      ```bash
      python evaluate.py --method IQDOT --point_step 1
      ```
*Note: The data extraction script (`dataset.py`) is adapted from the [PointCloudDatasets](https://github.com/antao97/PointCloudDatasets) repository. We thank the authors for their contribution.*

### Molecule Generation
**Backbone**: [EDM](https://github.com/ehoogeboom/e3_diffusion_for_molecules), [UniGEM](https://github.com/fengshikun/UniGEM)

1. Navigate to the `Molecule_Generation` directory and clone the original backbone repositories. Then follow their instructions to download the QM9 and DRUG datasets.
   ```bash
   cd Molecule_Generation
   git clone https://github.com/ehoogeboom/e3_diffusion_for_molecules.git
   git clone https://github.com/fengshikun/UniGEM.git
   ```

2. In each of the cloned EDM and UniGEM repositories, replace the `en_diffusion/en_diffusion.py` file with the one provided in our `Molecule_Generation` directory.

3. To train EDM and UniGEM on the QM9 dataset, run the following script:
   ```bash
   bash QDOT_train_QM9.sh
   ```

   To evaluate the trained models on the QM9 dataset:
   ```bash
   bash QDOT_eval_QM9.sh
   ```

4. To fine-tune UniGEM on the DRUG dataset, first download the pre-trained checkpoints as instructed in the original [UniGEM](https://github.com/fengshikun/UniGEM) repository. Then, run the fine-tuning script:
   ```bash
   bash QDOT_ft_DRUG.sh
   ```
   To evaluate the fine-tuned model on the DRUG dataset:
   ```bash
   bash QDOT_eval_DRUG.sh
   ```
5. We provide pre-trained checkpoints for evaluation with **QDOT-0.3** in the following directories: `Molecule_Generation/EDM_QM9_ckpt`, `Molecule_Generation/UniGEM_QM9_ckpt`, and `Molecule_Generation/UniGEM_DRUG_ckpt`. You are welcome to use them to test and reproduce our results. Checkpoints for training will be released soon.

### Toy Example and Parameter Analysis
   * See `toy_example/toy_example.ipynb` for the code and results.

---

## Main References
Rémi Flamary, Nicolas Courty, Alexandre Gramfort, Mokhtar Z. Alaya, Aurélie Boisbunon, Stanislas Chambon, Laetitia Chapel, Adrien Corenflos, Kilian Fatras, Nemo Fournier, Léo Gautheron, Nathalie T.H. Gayraud, Hicham Janati, Alain Rakotomamonjy, Ievgen Redko, Antoine Rolet, Antony Schutz, Vivien Seguy, Danica J. Sutherland, Romain Tavenard, Alexander Tong, and Titouan Vayer. "POT Python Optimal Transport library." Journal of Machine Learning Research 22(78): 1-8, 2021. [\[Web\]](https://pythonot.github.io/)

Titouan Vayer, Rémi Flamary, Romain Tavenard, Laetitia Chapel, and Nicolas Courty. "Sliced Gromov-Wasserstein." Thirty-third Conference on Neural Information Processing Systems (NeurIPS 2019). Vol. 32. 2019. [\[Github\]](https://github.com/tvayer/SGW)

Emiel Hoogeboom, Victor Garcia Satorras, Clément Vignac, and Max Welling. "Equivariant Diffusion for Molecule Generation in 3D." International Conference on Machine Learning. PMLR, 2022. [\[Github\]](https://github.com/ehoogeboom/e3_diffusion_for_molecules)

Shikun Feng, Yuyan Ni, Yan Lu, Zhi-Ming Ma, Wei-Ying Ma, and Yanyan Lan. "UniGEM: A Unified Approach to Generation and Property Prediction for Molecules." The Thirteenth International Conference on Learning Representations, 2025. [\[Github\]](https://github.com/fengshikun/UniGEM)






