
# Training an Object Manipulation Model

This README provides step-by-step instructions for training a model for object manipulation. The process involves setting up the environment, cloning the repository, preparing datasets, and running the training script. Training is divided into two stages (Stage 1 and Stage 2), which share the same script but differ in their data loaders and trainable parameters.

## 1. Environment Setup

Install the required Python packages from the `requirements.txt` file. It is recommended to use a virtual environment (e.g., via `venv` or `conda`) to avoid dependency conflicts.

```bash
pip install -r requirements.txt
```
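
As a minimal sketch of the virtual-environment route, assuming `python3` ships with the standard `venv` module: the `setup_env` helper and the `.venv` directory name below are illustrative choices and are not part of the repository.

```bash
# Sketch: create an isolated environment and install the dependencies into it.
# ENV_DIR is an assumed name; any directory works.
ENV_DIR="${ENV_DIR:-.venv}"

setup_env() {
    # Stop early if we are not in the directory that holds requirements.txt
    if [ ! -f requirements.txt ]; then
        echo "requirements.txt not found in $(pwd)" >&2
        return 1
    fi
    # Create the environment, then activate it in the current shell
    python3 -m venv "$ENV_DIR" || return 1
    . "$ENV_DIR/bin/activate" || return 1
    # Install the pinned packages with the environment's own pip
    python -m pip install -r requirements.txt
}
```

Call `setup_env` from the directory that contains `requirements.txt`. The existence check is there so that a run from the wrong directory fails before an empty environment is created.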

## 2. Clone the VACE Repository and Download Checkpoint

- Clone the VACE repository from GitHub.
- Download the model checkpoint (refer to the repository's documentation for the download link and instructions on where to place the checkpoint file).

```bash
# SSH clone (requires a GitHub SSH key); HTTPS alternative: https://github.com/ali-vilab/VACE.git
git clone git@github.com:ali-vilab/VACE.git
```

## 3. Prepare the Image and Video Dataset

Prepare a dataset of images and videos for training. Organize the data in the layout the training script expects; see the data loader code for details.
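
Before launching a run, it can help to confirm that the dataset directory actually contains the files you expect. The following is a rough sketch only: `count_media` is a hypothetical helper, and the extension lists are assumptions that should be adjusted to whatever formats the data loader reads.

```bash
# Sketch: count image or video files under a dataset root.
# The extension lists are assumptions, not taken from the data loader.
count_media() {
    # $1 = dataset root, $2 = "image" or "video"
    case "$2" in
        image)
            find "$1" -type f \( -iname '*.jpg' -o -iname '*.jpeg' -o -iname '*.png' \) \
                | wc -l | tr -d ' '
            ;;
        video)
            find "$1" -type f \( -iname '*.mp4' -o -iname '*.mov' \) \
                | wc -l | tr -d ' '
            ;;
        *)
            echo "usage: count_media <dir> image|video" >&2
            return 1
            ;;
    esac
}
```

For example, `count_media ./data image` prints the number of image files found anywhere under a hypothetical `./data` directory. A count of zero usually means the path or the extensions are wrong.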

## 4. Train the Model

Run the training script through the provided bash file. To switch between stages, edit the data loader and trainable-module settings in the script.

```bash
bash train.sh
```

### Training Notes
- **Stages**: Stage 1 and Stage 2 use the same training script (`train.sh`), but they differ in:
  - Data loaders: Stage 1 uses image-only loaders with randomly rendered object images, while Stage 2 incorporates high-quality image and video loaders for background preservation and visual quality.
  - Trainable parameters: Stage 1 trains both the main and control branches, while in Stage 2 only the control branch is trainable.
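
Because both stages go through the same `train.sh`, it is easy to lose track of which configuration produced which output. One possible way to keep them apart is sketched below. `run_stage` and `LOG_DIR` are hypothetical and not part of the repository, and the wrapper does not configure anything itself: `train.sh` must already be edited for the stage being run.

```bash
# Sketch: run one training stage and keep its console output in a per-stage log.
# run_stage and LOG_DIR are hypothetical helpers, not part of the repository.
LOG_DIR="${LOG_DIR:-logs}"

run_stage() {
    stage="$1"
    # Print a reminder of what the chosen stage is expected to use
    case "$stage" in
        1) echo "Stage 1: image-only loaders; main and control branches trainable" ;;
        2) echo "Stage 2: image and video loaders; control branch only" ;;
        *) echo "usage: run_stage 1|2" >&2; return 1 ;;
    esac
    mkdir -p "$LOG_DIR"
    # Capture stdout and stderr; the function returns train.sh's exit status
    bash train.sh > "$LOG_DIR/stage${stage}.log" 2>&1
}
```

With this in place, `run_stage 1` writes to `logs/stage1.log` and `run_stage 2` to `logs/stage2.log`. Output is redirected rather than piped through `tee` so that the exit status of `train.sh` is preserved; use `tail -f` on the log to follow progress.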

