# SceneForger

SceneForger is a tool for training on and generating combined 3D shapes with captions. This repository contains the training code and an example script for plotting combined 3D shapes and captions. We release the base code for combining objects at batch creation time, together with the best-performing backbone implementation, Uni3D. The `datasets.py` code is general, however, and can be plugged into any model.

## Repository Structure

- `SceneForger/`: Contains all the code for training, adapted from [Uni3D](https://github.com/baaivision/Uni3D).
- `example_combine.py`: Example script for plotting combined 3D shapes and captions.
- `samples/`: Contains point cloud and caption pairs for running the example script.

## Installation

To install SceneForger, follow the [Uni3D installation instructions](https://github.com/baaivision/Uni3D#installation). You will also need to install the following libraries:

```bash
pip install transformers vllm
```

## Usage

To run the example object combination script, use the following command:

```bash
python example_combine.py
```

This generates combined samples from the sample data provided in the `samples/` folder.

## Pre-training
1. Please refer to [DATASETS.md](data/DATASETS.md) for pre-training dataset preparation.
2. [Recommended 🤗] Download the [CLIP model](https://huggingface.co/timm/eva02_enormous_patch14_plus_clip_224.laion2b_s9b_b144k/blob/main/open_clip_pytorch_model.bin) and place it in the `/path/to/clip_model` folder.
3. [Recommended 🤗] Download the [initialization model](https://huggingface.co/timm/eva_giant_patch14_560.m30m_ft_in22k_in1k/blob/main/model.safetensors) and place it in the `/path/to/init_model` folder.
4. Run `bash scripts/pretrain.sh` to pre-train the model on the ensemble datasets.
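
As a rough illustration of what combining objects at batch creation time can look like, the sketch below places two point clouds side by side and merges their captions. It is not the code from `datasets.py`: the function names `normalize_unit_sphere` and `combine_pair`, the `gap` parameter, the side-by-side placement along the x-axis, and the "and"-joined caption are all assumptions made for illustration, and the random arrays stand in for real samples.

```python
import numpy as np


def normalize_unit_sphere(points):
    """Center an (N, 3) point cloud and scale it to fit inside the unit sphere."""
    centered = points - points.mean(axis=0)
    scale = np.linalg.norm(centered, axis=1).max()
    return centered / scale if scale > 0 else centered


def combine_pair(points_a, caption_a, points_b, caption_b, gap=0.5):
    """Hypothetical helper: put two shapes side by side and merge their captions."""
    a = normalize_unit_sphere(points_a)
    b = normalize_unit_sphere(points_b)
    # Shift the shapes in opposite directions along x so they do not overlap.
    offset = np.array([1.0 + gap / 2.0, 0.0, 0.0])
    combined = np.concatenate([a - offset, b + offset], axis=0)
    # Re-normalize so the combined scene has the same scale as a single object.
    combined = normalize_unit_sphere(combined)
    caption = f"{caption_a} and {caption_b}"
    return combined, caption


# Placeholder data: random points standing in for two real point clouds.
rng = np.random.default_rng(0)
chair = rng.normal(size=(1024, 3))
table = rng.normal(size=(1024, 3))
scene, text = combine_pair(chair, "a chair", table, "a table")
print(scene.shape, text)
```

In a training pipeline, a step like this would typically run inside the dataset or collate function, so that each batch can mix single objects with freshly combined ones.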

## Testing
1. Please refer to [DATASETS.md](data/DATASETS.md) for Uni3D test dataset preparation.
2. Run `data/create_scannet_instances_dataset.py` to split the ScanNet dataset into object instances and labels.
3. Run `bash scripts/eval.sh` to test the model on the test datasets.
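
To give a sense of what splitting a scene into object instances and labels involves, here is a minimal sketch. It does not reproduce `data/create_scannet_instances_dataset.py`: it assumes the scene is already loaded as NumPy arrays with one instance id and one semantic label per point, and the function name `split_instances`, the `ignore_id` convention, and the majority-vote labeling are hypothetical choices.

```python
import numpy as np


def split_instances(points, instance_ids, labels, ignore_id=-1):
    """Group scene points by instance id and return a list of (points, label)."""
    objects = []
    for inst in np.unique(instance_ids):
        if inst == ignore_id:
            continue  # skip points that belong to no annotated instance
        mask = instance_ids == inst
        # Use the most frequent semantic label among the instance's points.
        values, counts = np.unique(labels[mask], return_counts=True)
        objects.append((points[mask], values[np.argmax(counts)]))
    return objects


# Toy scene: six points, two annotated instances, one unannotated point.
points = np.arange(18, dtype=float).reshape(6, 3)
instance_ids = np.array([0, 0, 1, 1, 1, -1])
labels = np.array(["chair", "chair", "table", "table", "sofa", "wall"])
objects = split_instances(points, instance_ids, labels)
for obj_points, obj_label in objects:
    print(obj_label, obj_points.shape)
```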

## Acknowledgements

We would like to thank the [Uni3D](https://github.com/baaivision/Uni3D) team for their foundational work, on which SF-Uni3D builds. We also thank the [OpenShape](https://github.com/Colin97/OpenShape_code) and [ViT-Lens](https://github.com/TencentARC/ViT-Lens) teams for their code, which we used to test our approach on their backbones.
