# iFusion

This is the code for our ECCV'24 submission, "iFusion: Inverting Diffusion for Pose-Free Reconstruction from Sparse Views."

## Installation
```bash
pip install -r requirements.txt
```

Download the Zero123-XL checkpoint and place it under `ldm/ckpt`:
```bash
cd ldm/ckpt && wget https://zero123.cs.columbia.edu/assets/zero123-xl.ckpt
```
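
A truncated or misplaced download is easier to catch before launching the demo than at model-loading time. The following is a minimal sketch, not part of this repository: the helper name `checkpoint_ready` is hypothetical, and the path is inferred from the download command above. It only checks that the file exists and is non-empty; it does not verify the checkpoint's integrity.

```python
from pathlib import Path

# Path inferred from the download step above (assumption); adjust if you
# store the checkpoint elsewhere.
CKPT_PATH = Path("ldm/ckpt/zero123-xl.ckpt")


def checkpoint_ready(path: Path = CKPT_PATH) -> bool:
    """Return True if `path` is an existing, non-empty file.

    Hypothetical helper for illustration; it does not validate the contents.
    """
    return path.is_file() and path.stat().st_size > 0


if __name__ == "__main__":
    status = "found" if checkpoint_ready() else "missing or empty"
    print(f"{CKPT_PATH}: {status}")
```

Run it from the repository root so that the relative path resolves.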

## Usage

Run the demo by specifying an image directory that contains at least 2 images:
```bash
python main.py data.image_dir=asset/sorter
```
The output includes `transform.json` (camera pose estimation) and `lora.ckpt` (multi-view fine-tuning). A qualitative visualization of novel view synthesis is saved as `demo.png`. Please refer to `config/main.yaml` for the detailed hyper-parameters.
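
Because the demo needs at least 2 images in the directory, a quick pre-flight check can catch an incomplete directory before starting a run. The sketch below is illustrative and not part of this repository: `list_images`, `has_enough_views`, and the set of accepted file extensions are assumptions, and the actual loader in `main.py` may filter files differently.

```python
from pathlib import Path

# Assumed set of image extensions; the repository's loader may accept others.
IMAGE_EXTS = {".png", ".jpg", ".jpeg"}


def list_images(image_dir):
    """Return the sorted image files directly inside `image_dir`.

    Returns an empty list if the directory does not exist.
    """
    root = Path(image_dir)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTS
    )


def has_enough_views(image_dir, minimum=2):
    """Return True if `image_dir` holds at least `minimum` images."""
    return len(list_images(image_dir)) >= minimum
```

For instance, `has_enough_views("asset/sorter")` should hold for the example directory used above, which provides `0.png` and `1.png`.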

An example result:
- Input views
  | Reference view          | Query view              |
  |-------------------------|-------------------------|
  | ![](asset/sorter/0.png) | ![](asset/sorter/1.png) |
- Zero123 (1 view)
  ![](zero123.png)
- Zero123 + iFusion (2 views)
  ![](zero123+ifusion.png)

## Acknowledgements
This repo is a wild mixture of [zero123](https://github.com/cvlab-columbia/zero123), [threestudio](https://github.com/threestudio-project/threestudio), and [lora](https://github.com/cloneofsimo/lora). We thank the contributors for their excellent work.
