# MSMT-GAN: Multi-Headed Spatial Dynamic Memory image refinement with Multi-Tailed Word level Initial Generation

This repository provides PyTorch code implementing our proposed Multi-Headed Spatial Dynamic Memory GAN for text-to-image synthesis.

## Requirements

To install requirements:

```setup
conda create --name myenv --file spec-file.txt
```

We use Python 3.6 and PyTorch 0.4.1.

## Data
- Download metadata for CUB: [bird.zip](https://drive.google.com/open?id=1O_LtUP9sch09QH3s_EBAgLEctBQ5JBSJ) and COCO: [coco.zip](https://drive.google.com/open?id=1rSnbIGNDGZeHlsUlLdahj0RJ9oo6lgH9) and extract files to the `data/` directory.

- Download the [CUB](http://www.vision.caltech.edu/visipedia/CUB-200-2011.html) image data. Extract image files to `data/birds/`

- Download the [COCO](http://cocodataset.org/#download) image data. Extract image files to `data/coco/`
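
Before training, it can help to confirm that the image directories listed above exist. The following is a minimal sketch, not part of this repository: the helper name `missing_dirs` is hypothetical, and it assumes the script is run from the repository root so that `data/birds` and `data/coco` resolve relative to it.

```python
from pathlib import Path


def missing_dirs(root, expected=("data/birds", "data/coco")):
    """Return the expected sub-directories that do not exist under `root`.

    `expected` defaults to the two image directories named in this README;
    the helper itself is a hypothetical convenience, not repository code.
    """
    root = Path(root)
    return [d for d in expected if not (root / d).is_dir()]
```

Calling `missing_dirs(".")` from the repository root returns an empty list when both directories are in place, and otherwise lists the ones still to be created.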

## Training

To train the MSMT-GAN model(s) in the paper, navigate to code/ and run:

```
# CUB
python -u main.py --cfg cfg/train_bird.yml --gpu 0
# COCO
python -u main.py --cfg cfg/train_coco.yml --gpu 0
```

Training hyper-parameters are specified in `code/cfg/train_bird.yml` and `code/cfg/train_coco.yml`.

## Evaluation

To generate 30000 synthetic images with a pretrained model, specify the model paths in `code/cfg/eval_bird.yml` or `code/cfg/eval_coco.yml`, then navigate to `code/` and run:

```
# CUB
python -u main.py --cfg cfg/eval_bird.yml --gpu 0
# COCO
python -u main.py --cfg cfg/eval_coco.yml --gpu 0
```

These image generation commands also print the mean R-precision over the generated image set.
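
For readers unfamiliar with the metric, here is a minimal sketch of how R-precision is commonly computed for text-to-image models, following the protocol popularised by AttnGAN: each generated image is compared against its own caption and a set of randomly chosen mismatched captions (100 candidates in total), and counts as a hit if its own caption has the highest cosine similarity. The function names, the plain-list embeddings, and the `num_candidates` and `seed` parameters are illustrative assumptions. This is not the code in `main.py`, which may differ in detail.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def r_precision(image_embs, text_embs, num_candidates=100, seed=0):
    """Fraction of images whose matching caption ranks first among the candidates.

    Assumes image_embs[i] and text_embs[i] form a matching pair. The candidate
    count is capped at the number of available captions.
    """
    n = len(image_embs)
    if n == 0 or n != len(text_embs):
        raise ValueError("need equal, non-zero numbers of image and text embeddings")
    rng = random.Random(seed)
    k = min(num_candidates, n)
    hits = 0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Index 0 is the ground-truth caption; the rest are random mismatches.
        candidates = [i] + rng.sample(others, k - 1)
        scores = [cosine(image_embs[i], text_embs[j]) for j in candidates]
        best = max(range(len(candidates)), key=scores.__getitem__)
        hits += best == 0
    return hits / n
```

In practice the embeddings would come from the pretrained DAMSM image and text encoders, and the score would be averaged over several subsets of the generated images to obtain a mean and standard deviation like those reported in the results table.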

To compute the FID score for generated images, we use [the official TensorFlow FID code](https://github.com/bioinf-jku/TTUR.git).

To compute Inception Score for generated images, we use:
- [Inception Score for CUB](https://github.com/hanzhanggit/StackGAN-inception-model).
- [Inception Score for COCO](https://github.com/openai/improved-gan/tree/master/inception_score).

## Pre-trained Models

- Download the [DAMSM model for CUB](https://drive.google.com/open?id=1GNUKjVeyWYBJ8hEU-yrfYQpDOkxEyP3V) and save it to `DAMSMencoders/`
- Download the [DAMSM model for COCO](https://drive.google.com/open?id=1zIrXCE9F6yfbEJIbNP5-YrEe2pZcPSGJ) and save it to `DAMSMencoders/`
- [MSMT-GAN for CUB](https://drive.google.com/file/d/1eEUW7j9v5eVVUAx4QUogTNPaMWDVbqK2)
- [MSMT-GAN for COCO](https://drive.google.com/file/d/1dNxNpT2s5lzmYRL9HrpK-LaBYfoNTmzZ)

Our MSMT-GAN models have been pretrained using the hyper-parameters specified in code/cfg/train_bird.yml and code/cfg/train_coco.yml. 

## Results

Our models achieve the following performance on the CUB and COCO datasets in comparison to previous methods:

|Model name          |FID ↓   | R-precision ↑  |IS ↑ |
|  ----------------- |------- | -------------- |-----|
|  **CUB** ||||
| AttnGAN | 14.01   |   67.82% ± 4.43%      |  4.36 ± 0.03 |
| DM-GAN | 11.91   |    76.58% ± 0.53%      |  4.71 ± 0.06 |
| MSMT-GAN (Ours) | 9.34   |    80.82% ± 0.54%      |  4.55 ± 0.06 |
|  **COCO** ||||
| AttnGAN | 29.53   |   85.47% ± 3.69%      |  25.89 ± 0.47 |
| Obj-GAN | 24.70   |   91.91% ± 2.37%      |  27.32 ± 0.40 |
| DM-GAN | 24.24   |    92.23% ± 0.37%      |  32.43 ± 0.58 |
| MSMT-GAN (Ours)| 23.22   |    92.46% ± 0.28%      |  28.91 ± 0.35 |


## Acknowledgements
This code borrows from the [AttnGAN](https://github.com/taoxugit/AttnGAN) and [DM-GAN](https://github.com/MinfengZhu/DM-GAN) repositories.
Many thanks.

## License

This code is released under the MIT License (refer to the LICENSE file for details).

