# MagicDrive: Street View Generation with Diverse 3D Geometry Control

## Environment Setup
The code is tested with `PyTorch==1.10.2` and `torchvision==0.11.3`.
Install both before starting. To install the remaining dependencies, run:
```bash
cd ${ROOT}
pip install -r requirements.txt
```

We install the following packages from source:

```bash
# install third-party
${ROOT}/third_party/
├── bevfusion -> based on db75150, there is a .diff for the change
├── diffusers -> based on v0.17.1 (afcca3916), there is a .diff for the change
└── xformers -> based on 0.0.19, with minor changes to install with PyTorch 1.10.2
```

## Pretrained Weights

Our training is based on [stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5).

We assume you put the weights under `${ROOT}/../pretrained/` as follows:

```bash
${ROOT}/../pretrained/stable-diffusion-v1-5/
├── README.md
├── feature_extractor
├── model_index.json
├── safety_checker
├── scheduler
├── text_encoder
├── tokenizer
├── unet
├── v1-5-pruned-emaonly.ckpt
├── v1-5-pruned.ckpt
├── v1-inference.yaml
└── vae
```
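
To catch an incomplete download early, the layout above can be checked with a short script. This is a minimal sketch, not part of the repository: the `missing_entries` helper and the `SD_DIR` variable are hypothetical names, `ROOT` is assumed to default to the current directory, and only the sub-folders and `model_index.json` from the listing are checked.

```bash
# Assumption: ROOT defaults to the current directory when it is not already set.
ROOT="${ROOT:-$PWD}"
# SD_DIR is a hypothetical variable name used only in this sketch.
SD_DIR="${SD_DIR:-${ROOT}/../pretrained/stable-diffusion-v1-5}"

# Hypothetical helper: print every expected entry that is missing under a directory.
missing_entries() {
  local dir="$1"
  shift
  local entry
  for entry in "$@"; do
    [ -e "${dir}/${entry}" ] || echo "${entry}"
  done
}

# Entries taken from the listing above; the .ckpt and .yaml files are not checked here.
missing="$(missing_entries "${SD_DIR}" \
  model_index.json feature_extractor safety_checker scheduler \
  text_encoder tokenizer unet vae)"

if [ -n "${missing}" ]; then
  echo "Missing under ${SD_DIR}:"
  echo "${missing}"
else
  echo "Pretrained weights layout looks complete."
fi
```

The script only reports what is missing; it does not download or modify anything.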

## Datasets
Please prepare the nuScenes dataset following [bevfusion's instructions](https://github.com/mit-han-lab/bevfusion#data-preparation). If you want to run CVT models, please also download their annotations as described [here](https://github.com/bradyz/cross_view_transformers#data).

The data structure should look like:

```bash
${ROOT}/../data/
├── cvt_labels_nuscenes
├── nuscenes
└── nuscenes_mmdet3d
```
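
A quick way to confirm the three folders are in place is sketched below. The `report_data_dir` helper and the `DATA_DIR` variable are hypothetical names for this example, and treating `cvt_labels_nuscenes` as optional rests on the assumption that it holds the CVT annotations mentioned above.

```bash
# Assumption: ROOT defaults to the current directory when it is not already set.
ROOT="${ROOT:-$PWD}"
# DATA_DIR is a hypothetical variable name used only in this sketch.
DATA_DIR="${DATA_DIR:-${ROOT}/../data}"

# Hypothetical helper: report which of the expected dataset folders exist.
report_data_dir() {
  local data_dir="$1"
  local name
  for name in nuscenes nuscenes_mmdet3d cvt_labels_nuscenes; do
    if [ -d "${data_dir}/${name}" ]; then
      echo "found: ${name}"
    elif [ "${name}" = "cvt_labels_nuscenes" ]; then
      # Assumption: this folder holds the CVT annotations, so it is optional otherwise.
      echo "missing (only needed for CVT models): ${name}"
    else
      echo "missing: ${name}"
    fi
  done
}

report_data_dir "${DATA_DIR}"
```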

## Image Generation
Our default log directory is `${ROOT}/magicdrive-log`. Please prepare it before running.

Run image generation with:

```bash
python tools/test.py resume_from_checkpoint=${RUN_LOG_DIR} task_id=${ANY}
```
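
As a concrete illustration of the command above, the sketch below creates the default log directory and prints a fully expanded command for inspection before launching it. The values `my_run` and `demo` are placeholders, not names from the repository, and placing the run directory under the log directory is an assumption; substitute your own run directory and task label.

```bash
# Assumption: ROOT defaults to the current directory when it is not already set.
ROOT="${ROOT:-$PWD}"
LOG_DIR="${ROOT}/magicdrive-log"

# Create the default log directory if it does not exist yet.
mkdir -p "${LOG_DIR}"

# Placeholder values: point RUN_LOG_DIR at the run you want to resume from,
# and set TASK_ID to any label you like.
RUN_LOG_DIR="${RUN_LOG_DIR:-${LOG_DIR}/my_run}"
TASK_ID="${TASK_ID:-demo}"

# Print the command so it can be checked; remove "echo" to actually run it.
echo python tools/test.py "resume_from_checkpoint=${RUN_LOG_DIR}" "task_id=${TASK_ID}"
```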

Check the experiment configs for the two models we used:

```bash
224x400 resolution: magicdrive_release/configs/exp/rawbox_mv2.0add0.1.yaml
272x736 resolution: magicdrive_release/configs/exp/rawbox_mv2.0+add0.1_nockpt.yaml
```

## Credit
We build on the following open-source projects:
- [bevfusion](https://github.com/mit-han-lab/bevfusion)
- [diffusers](https://github.com/huggingface/diffusers)
- [xformers](https://github.com/facebookresearch/xformers)
