# DCMEM

## Datasets

The CUBICC dataset can be downloaded from https://polybox.ethz.ch/index.php/s/LRkTC2oa6YHHlUj/download [1]. The MNIST and SVHN datasets that make up MNIST-SVHN are available through `torchvision.datasets` and do not need to be downloaded manually.
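For convenience, the CUBICC download could be scripted with the Python standard library. The following is a minimal sketch, not part of the repository: the helper names and the archive filename `CUBICC.zip` are assumptions, as is the choice to place the archive directly in `data/datasets`. The archive format and the layout expected by `Datasets/dataset_CUBICC.py` are not specified here, so any extraction step is left out.

```python
import shutil
import urllib.request
from pathlib import Path

# URL taken from the Datasets section of this README.
CUBICC_URL = "https://polybox.ethz.ch/index.php/s/LRkTC2oa6YHHlUj/download"


def cubicc_archive_path(dataset_dir="data/datasets", filename="CUBICC.zip"):
    """Return the local path for the archive (the filename is an assumption)."""
    return Path(dataset_dir) / filename


def download_cubicc(dataset_dir="data/datasets", filename="CUBICC.zip"):
    """Download the CUBICC archive once and return its local path."""
    target = cubicc_archive_path(dataset_dir, filename)
    target.parent.mkdir(parents=True, exist_ok=True)
    if target.exists():
        return target  # already downloaded, nothing to do

    # Write to a temporary file first so an interrupted download
    # does not leave a truncated archive under the final name.
    partial = target.with_name(target.name + ".part")
    with urllib.request.urlopen(CUBICC_URL) as response, open(partial, "wb") as out:
        shutil.copyfileobj(response, out)
    partial.replace(target)
    return target
```

Calling `download_cubicc()` from the repository root would store the archive under `data/datasets`; it still has to be unpacked by hand into whatever layout the dataset loader expects.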

## Input and Output Directories

The program reads its datasets from the `data/datasets` folder and writes its output files to the `data/runs` folder.
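If these folders do not exist yet, they can be created up front. The sketch below assumes it is run from the repository root; the function name `ensure_data_dirs` is hypothetical, and the training scripts may well create the folders themselves.

```python
from pathlib import Path


def ensure_data_dirs(root="."):
    """Create data/datasets and data/runs under `root` if they are missing."""
    root = Path(root)
    dirs = {
        "datasets": root / "data" / "datasets",  # input datasets
        "runs": root / "data" / "runs",          # output files
    }
    for path in dirs.values():
        path.mkdir(parents=True, exist_ok=True)
    return dirs
```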

## File Descriptions

The file descriptions are as follows:
```
Datasets/dataset_CUBICC.py: Loads and preprocesses the CUBICC dataset for training and evaluation.
Datasets/dataset_MNIST_SVHN.py: Loads and preprocesses the MNIST-SVHN dataset for training and evaluation.

Models/encoder_decoder: Defines encoder and decoder architectures used by multimodal VAE models.
Models/mvae_CUBICC.py: Implements the multimodal VAE specifically for the CUBICC dataset.
Models/mvae_MNIST_SVHN.py: Implements the multimodal VAE specifically for the MNIST-SVHN dataset.
Models/mvae.py: Provides base multimodal VAE model components shared across datasets.

utils/classifier.py: Implements the classifier models used for evaluation.
utils/test_fuctions_CUBICC.py: Provides evaluation and testing functions specific to the CUBICC dataset.
utils/test_fuctions_MNIST_SVHN.py: Provides evaluation and testing functions specific to the MNIST-SVHN dataset.
utils/test_fuctions_utils.py: Provides evaluation and testing functions shared across all datasets.

main_CUBICC.py: Entry point for training and testing the DCMEM model on the CUBICC dataset.
main_MNIST_SVHN.py: Entry point for training and testing the DCMEM model on the MNIST-SVHN dataset.
```

## Requirements

```
python                       3.7.12
argparse                     1.4.0
imageio                      2.1.2
matplotlib                   3.5.3
numpy                        1.21.6
pandas                       1.3.5
scanpy                       1.9.3
scikit-image                 0.19.3
scikit-learn                 1.0.2
scipy                        1.7.3
sparse                       0.13.0
torch                        1.13.1
torchvision                  0.9.1+cu101
torchmetrics                 0.11.4
tqdm                         4.66.4
```
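To compare an existing environment against these pins, a small check along the following lines may help. It is a sketch rather than a repository tool: the function name `check_versions` is hypothetical, the Python interpreter version itself is not checked, and on Python 3.7 the `importlib_metadata` backport is assumed to be installed.

```python
try:
    from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
except ImportError:
    # Python 3.7: requires the importlib_metadata backport package
    from importlib_metadata import PackageNotFoundError, version

# Package pins copied from the Requirements list above.
PINNED = {
    "argparse": "1.4.0",
    "imageio": "2.1.2",
    "matplotlib": "3.5.3",
    "numpy": "1.21.6",
    "pandas": "1.3.5",
    "scanpy": "1.9.3",
    "scikit-image": "0.19.3",
    "scikit-learn": "1.0.2",
    "scipy": "1.7.3",
    "sparse": "0.13.0",
    "torch": "1.13.1",
    "torchvision": "0.9.1+cu101",
    "torchmetrics": "0.11.4",
    "tqdm": "4.66.4",
}


def check_versions(pinned):
    """Return {package: (expected, installed or None)} for every mismatch."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches
```

Running `check_versions(PINNED)` returns an empty dictionary when every listed package matches its pinned version exactly.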

## Experiments

To train and evaluate on the MNIST-SVHN dataset, run `main_MNIST_SVHN.py` with the commands below. The `-sup` flag sets the supervision level and `--missing` selects the missing modality.

### Fully Paired Training

```bash
python main_MNIST_SVHN.py -sup 1.0
```

### Partially Paired Training With Missing Modality (e.g., MNIST missing)

```bash
python main_MNIST_SVHN.py -sup 0.5 --missing mnist
```
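The two settings above can also be launched in sequence from Python. This is a minimal sketch: the helper names `build_command` and `run_all` are hypothetical, and it uses only the flag values shown in this README (`-sup 1.0`, `-sup 0.5`, `--missing mnist`); other values are not documented here and are not assumed.

```python
import subprocess
import sys


def build_command(sup, missing=None, script="main_MNIST_SVHN.py"):
    """Build the argument list for one run of the training script."""
    cmd = [sys.executable, script, "-sup", str(sup)]
    if missing is not None:
        cmd += ["--missing", missing]
    return cmd


def run_all(settings):
    """Run each (sup, missing) setting in order, stopping on the first failure."""
    for sup, missing in settings:
        subprocess.run(build_command(sup, missing), check=True)


# The two experiments documented above: fully paired, then MNIST missing.
SETTINGS = [(1.0, None), (0.5, "mnist")]
```

Calling `run_all(SETTINGS)` from the repository root runs the fully paired experiment first and the partially paired one second.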

## References

[1] Palumbo E, Manduchi L, Laguna S, et al. Deep generative clustering with multimodal diffusion variational autoencoders. In: The Twelfth International Conference on Learning Representations, 2024.
