# Unsupervised Learning of Object-Centric Representation from Multi-Viewpoint Scenes

This repository contains the implementation of Unsupervised Learning of Object-Centric Representation from Multi-Viewpoint Scenes (LORM).

### Dependencies

```
torch			1.11.0
torchvision		0.12.0
h5py			3.6.0
numpy			1.21.2
tensorboard		2.6.0
moviepy			1.0.3
```
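To confirm that an environment matches the versions listed above, a small check along these lines may help. It is only a sketch: the helper name `find_mismatches` is hypothetical and not part of this repository, and it assumes the packages are installed under the same distribution names used in the list.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the dependency list above.
REQUIRED = {
    "torch": "1.11.0",
    "torchvision": "0.12.0",
    "h5py": "3.6.0",
    "numpy": "1.21.2",
    "tensorboard": "2.6.0",
    "moviepy": "1.0.3",
}


def find_mismatches(required):
    """Return {package: (required, installed)} for packages that differ.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in required.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        # Local build tags such as "1.11.0+cu113" still count as a match.
        if installed is None or installed.split("+")[0] != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in find_mismatches(REQUIRED).items():
        print(f"{name}: required {wanted}, found {installed or 'not installed'}")
```

The script prints nothing when every package matches its pinned version.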

### Dataset

To create the CLEVR-A dataset used in the paper, run these commands:

```
cd dataset/create_clevr_a
source create_blend.sh
source create_pngs.sh
source create_h5.sh
```

To create the SHOP dataset used in the paper, run these commands:

```
cd dataset/create_shop
source create_blend.sh
source create_pngs.sh
source create_h5.sh
```
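The last step of each pipeline, `create_h5.sh`, presumably writes the dataset as an HDF5 file. The internal layout of that file is not documented here, so the sketch below makes no assumptions about it and simply lists whatever datasets a file contains. The helper name `list_h5_datasets` and the file path in the usage note are hypothetical.

```python
def list_h5_datasets(path):
    """Return {dataset_name: (shape, dtype)} for every dataset in an HDF5 file."""
    import h5py  # imported here so the helper can be defined without h5py installed

    found = {}

    def record(name, obj):
        # visititems also visits groups; keep only the actual datasets.
        if isinstance(obj, h5py.Dataset):
            found[name] = (obj.shape, str(obj.dtype))

    with h5py.File(path, "r") as f:
        f.visititems(record)
    return found
```

For example, `list_h5_datasets("path/to/dataset.h5")` returns a dictionary that can be printed to check array shapes before setting `DATAPATH` for training.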

### Training

To train the model described in the paper, run:

```
source run.sh
```

Check `src/train.py` to see the full list of training arguments.

To modify the hyperparameters, edit `run.sh`.

To train on a different dataset, set the `DATANAME` variable (the dataset name) and the `DATAPATH` variable (the path to the dataset) in `run.sh`.

### Outputs

The training code writes TensorBoard logs. To view them, point TensorBoard at the logging directory, for example `tensorboard --logdir <logging directory>`. The logs contain the training loss curves and visualizations of reconstructions and object attention maps.
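The logged loss curves can also be read programmatically, which is convenient for plotting or comparing runs outside the TensorBoard UI. The sketch below uses TensorBoard's `EventAccumulator`; the helper name `load_scalars` is hypothetical, and because the scalar tag names written by `train.py` are not listed here, it returns every scalar tag it finds rather than assuming specific ones.

```python
def load_scalars(logdir):
    """Return {tag: [(step, value), ...]} for the scalar tags found in logdir."""
    # Imported here so the helper can be defined without tensorboard installed.
    from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

    accumulator = EventAccumulator(logdir)
    accumulator.Reload()  # parse the event files on disk

    scalars = {}
    for tag in accumulator.Tags()["scalars"]:
        scalars[tag] = [(event.step, event.value) for event in accumulator.Scalars(tag)]
    return scalars
```

Calling `load_scalars` on the logging directory and printing the keys of the result shows which curves are available.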

### Code Files

This repository provides the following files:

- `train.py` contains the training script.
- `lorm.py` provides the model class for LORM.
- `data_h5.py` contains the dataset class.
- `dvae.py` provides the model class for Discrete-VAE.
- `utils.py` provides helper classes and functions.