### 1. Preparing Datasets

Repository structure:
```bash
MAM-CLIP/
    |--cfg.yaml
    |--train.py: Image-text model training script
    |--loaders.py: Dataloaders for training
    |--files/
    |   |--dataset.py: Image-text dataset
    |   |--model.py: Vision-language model and PyTorch Lightning model training
    |   |--nnblocks.py: LayerNorm and Transformer blocks used in model.py
```

### 2. Loading Model Weights

Model weights are hosted on Hugging Face and will be made public upon acceptance. The `load_model_from_hf.ipynb` notebook can be used to load the model together with its pretrained weights.
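
For loading weights outside the notebook, the sketch below shows one way to prepare a checkpoint for a bare model. It assumes the weights are distributed as a PyTorch Lightning checkpoint, which this README does not confirm. Such checkpoints store parameters under a `state_dict` key, with each name prefixed by the attribute that holds the wrapped network. The helper name `extract_model_weights`, the default prefix `model.`, and the checkpoint file name in the comments are all hypothetical.

```python
from collections import OrderedDict
from typing import Any, Mapping


def extract_model_weights(
    checkpoint: Mapping[str, Any], prefix: str = "model."
) -> "OrderedDict[str, Any]":
    """Return checkpoint weights with `prefix` stripped from the parameter names.

    `prefix` is an assumption: it should match the attribute name under which
    the Lightning module stores the vision-language model.
    """
    # Lightning checkpoints nest the weights under "state_dict";
    # a plain state dict is used as-is.
    state_dict = checkpoint.get("state_dict", checkpoint)

    cleaned = OrderedDict()
    for key, value in state_dict.items():
        # Keep only entries that belong to the wrapped model and drop the prefix.
        # Entries without the prefix (e.g. other submodules) are skipped.
        if key.startswith(prefix):
            cleaned[key[len(prefix):]] = value
    return cleaned


# Hypothetical usage (requires torch; the file name is a placeholder):
# import torch
# ckpt = torch.load("checkpoint.ckpt", map_location="cpu")
# vlm.load_state_dict(extract_model_weights(ckpt))
```

If the published weights turn out to be a plain state dict with unprefixed names, passing `prefix=""` keeps every entry unchanged.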






