
This code is structured as follows:

- `stage1`: VAE modules
- `stage2`: diffusion module
- `clip`: dataset alignment
- `preprocessing`: functions to create subsets and generate pretrained features

1. The VAE (the first-stage module) is located in the `stage1` folder. After setting the correct configuration in the
corresponding config file and defining the appropriate data loader, you can train this module with `python main.py`.
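
As a sketch, a first-stage config might look like the fragment below. All keys and values here are hypothetical placeholders; match them to the actual config file in `stage1`:

```yaml
# Hypothetical stage-1 (VAE) config fragment -- key names are illustrative only.
model:
  z_channels: 4        # latent channels produced by the encoder
  base_channels: 64
data:
  dataset_path: /path/to/dataset
  batch_size: 16
training:
  lr: 1.0e-4
  max_epochs: 100
```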

2. After training the VAE:
    - Instantiate the CLIP model and transfer the VAE encoder weights to the CLIP model encoder.
    - Train the dataset alignment with `python cliptrainer.py`. This step can be skipped if you already have a
    pretrained set-transformer suitable for your tasks.
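
The weight transfer in the first sub-step can be sketched as follows. The classes below are tiny stand-ins for the real `stage1` VAE and the CLIP model, not the repo's actual definitions; the point is the `load_state_dict` pattern, which requires the two encoders to share the same architecture:

```python
import torch
import torch.nn as nn

# Stand-in modules: the real encoder/VAE/CLIP classes live in the repo.
class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3, padding=1)

class VAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = Encoder()
        self.decoder = nn.ConvTranspose2d(8, 3, 3, padding=1)

class CLIPModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = Encoder()                # same architecture as the VAE encoder
        self.set_transformer = nn.Linear(8, 8)  # placeholder alignment head

vae = VAE()          # in practice: load the trained stage-1 checkpoint here
clip_model = CLIPModel()

# Copy only the encoder weights into the CLIP model; the default strict=True
# verifies that the two encoders have identical parameter names and shapes.
clip_model.encoder.load_state_dict(vae.encoder.state_dict())
```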

3. After training the set-transformer, extract its weights from the checkpoint of the previous stage and save them separately.
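
A minimal sketch of this extraction, assuming the set-transformer's parameters are stored under a `set_transformer.` prefix in the full state dict (the stand-in model and the prefix are assumptions about how `cliptrainer.py` saves its checkpoint):

```python
import torch
import torch.nn as nn

# Stand-in for the trained CLIP model from the previous stage.
class CLIPModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(16, 8)
        self.set_transformer = nn.Linear(8, 8)

model = CLIPModel()  # in practice: load the trained checkpoint here
full_state = model.state_dict()

# Keep only the set-transformer parameters and strip their prefix so the
# result can be loaded directly into a standalone set-transformer module.
prefix = "set_transformer."
sub_state = {k[len(prefix):]: v for k, v in full_state.items() if k.startswith(prefix)}
torch.save(sub_state, "set_transformer.pt")
```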

4. For training the diffusion process:
    - Load the pretrained VAE and the pretrained set-transformer by setting their checkpoint locations in the config file.
    - The diffusion model is located in `stage2`. Train it by running `python dtrainer.py`.
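
Loading the pretrained modules for this step might look like the sketch below. The stand-in modules and checkpoint paths are placeholders for the classes and locations set in the config file, and the freezing at the end reflects the common latent-diffusion practice of optimizing only the denoiser:

```python
import torch
import torch.nn as nn

# Stand-ins for the pretrained stage-1 VAE and the set-transformer; the real
# classes and checkpoint paths come from the stage2 config file.
vae = nn.Linear(16, 8)
set_transformer = nn.Linear(8, 8)

# Simulate the checkpoints produced by the earlier stages.
torch.save(vae.state_dict(), "vae.pt")
torch.save(set_transformer.state_dict(), "set_transformer.pt")

# What dtrainer.py would do at startup: restore both pretrained modules.
vae.load_state_dict(torch.load("vae.pt", map_location="cpu"))
set_transformer.load_state_dict(torch.load("set_transformer.pt", map_location="cpu"))

# Freeze both so that diffusion training updates only the denoiser.
for module in (vae, set_transformer):
    for p in module.parameters():
        p.requires_grad = False
```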

Additionally, the `Helpers` folder contains helper functions for reconstructing a pretrained checkpoint from vectors.
The file `compute_condion.py` shows an example of building the input for the set-transformer.
The `preprocessing` folder provides example code for computing pretrained features.
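
The checkpoint-from-vectors round trip that the helpers perform can be sketched as follows; the function names are illustrative, not the actual helper API. Parameters are flattened into one vector, and a state dict is rebuilt from that vector using a reference state dict for the names and shapes:

```python
import torch
import torch.nn as nn

def to_vector(state):
    """Flatten all tensors of a state dict into a single 1-D vector."""
    return torch.cat([t.flatten() for t in state.values()])

def from_vector(vec, reference):
    """Rebuild a state dict from a flat vector, taking names/shapes from `reference`."""
    out, offset = {}, 0
    for name, t in reference.items():
        n = t.numel()
        out[name] = vec[offset:offset + n].view_as(t)
        offset += n
    return out

model = nn.Linear(4, 2)        # stand-in for a pretrained model
ref = model.state_dict()
vec = to_vector(ref)           # one flat vector of all parameters
rebuilt = from_vector(vec, ref)  # checkpoint reconstructed from the vector
```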


