# VETA-DiT: Variance-Equalized and Temporally Adaptive Quantization for Efficient 4-bit Diffusion Transformers

This repository contains the official code for our paper, "VETA-DiT: Variance-Equalized and Temporally Adaptive Quantization for Efficient 4-bit Diffusion Transformers".

## Setup

First, download and set up the repo.

Then create the environment and install required packages:

```bash
conda create -n veta-dit python=3.10
conda activate veta-dit
pip install -r requirements.txt
```

## Usage for DiT

### Calibration Data

Use the following command to generate the calibration data for VETA-DiT:

```bash
cd dit
bash get_calib_data.sh  # You can change the save path in the script
```

### Quantization and Inference

- Example for quantizing DiT-XL/2 with 100 timesteps into W4A8 on ImageNet 256x256 generation.

```bash
# Change --ptq-config and --log to use a different quantization config or log path.
python ptq_inference.py \
    --image-size 256 \
    --seed 42 \
    --num-sampling-steps 100 \
    --ptq-config "./configs/w4a8.yaml" \
    --log "./logs/w4a8" \
    --cfg-scale 1.5 \
    --argument_method "inco" \
    --smooth_quant \
    --gptq
```

- Example for quantizing DiT-XL/2 with 100 timesteps into W4A4 on ImageNet 256x256 generation.

```bash
# Change --ptq-config and --log to use a different quantization config or log path.
python ptq_inference.py \
    --image-size 256 \
    --seed 42 \
    --num-sampling-steps 100 \
    --ptq-config "./configs/w4a4.yaml" \
    --log "./logs/w4a4" \
    --cfg-scale 1.5 \
    --argument_method "inco" \
    --smooth_quant \
    --gptq
```
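
The two runs above differ only in the config and log paths, so they can be driven from one small wrapper. The sketch below is a convenience script under stated assumptions, not part of the repo: `run_ptq` and `DRY_RUN` are hypothetical names, the working directory is assumed to be `dit`, and only the flags shown above are used. By default it prints the commands; set `DRY_RUN=0` to execute them.

```bash
#!/usr/bin/env sh
# Sketch only: run_ptq and DRY_RUN are hypothetical helpers, not part of the repo.
# Assumes the current directory is dit/ and the referenced configs exist.
set -eu

run_ptq() {
    precision="$1"  # "w4a8" or "w4a4"
    # Build the same argument list as the example commands.
    set -- python ptq_inference.py \
        --image-size 256 \
        --seed 42 \
        --num-sampling-steps 100 \
        --ptq-config "./configs/${precision}.yaml" \
        --log "./logs/${precision}" \
        --cfg-scale 1.5 \
        --argument_method "inco" \
        --smooth_quant \
        --gptq
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"   # dry run (default): print the command only
    else
        "$@"        # DRY_RUN=0: launch quantization and inference
    fi
}

for p in w4a8 w4a4; do
    run_ptq "$p"
done
```

Saved under an illustrative name such as `run_dit_ptq.sh`, `DRY_RUN=0 sh run_dit_ptq.sh` would run both configurations in sequence.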

## Usage for PixArt

### Calibration Data

Use the following command to generate the calibration data for VETA-DiT:

```bash
cd pixart
bash get_calib_data_pixart.sh  # You can change the save path in the script
```

### Quantization and Inference

- Example for quantizing PixArt into W4A8 on COCO generation.

```bash
python ptq_inference.py --quant-config "./configs/w4a8.yaml" --log "./logs/w4a8" --argument_method "inco" --smooth_quant --gptq --prompt 'assets/coco_1024.txt'
```

- Example for quantizing PixArt into W4A4 on COCO generation.

```bash
python ptq_inference.py --quant-config "./configs/w4a4.yaml" --log "./logs/w4a4" --argument_method "inco" --smooth_quant --gptq --prompt 'assets/coco_1024.txt'
```

## Usage for Open-Sora

Set up the environment following `opensora1.2/OpenSORA/README.md`.

### Calibration Data

Use the following command to generate the calibration data for VETA-DiT:

```bash
cd dit
python get_calib_data.py configs/software_simulation_w4a8.py --prompt-path ./assets/t2v_samples_1.txt --seed 42  # You can change the save path in the script
```

### Quantization and Inference

- Example for quantizing Open-Sora into W4A8.

```bash
python ptq_inference.py configs/software_simulation_w4a8.py --save-dir "./logs/w4a8" --prompt-path ./assets/t2v_samples.txt
```

- Example for quantizing Open-Sora into W4A4.

```bash
python ptq_inference.py configs/software_simulation_w4a4.py --save-dir "./logs/w4a4" --prompt-path ./assets/t2v_samples.txt
```

## Evaluation

We use the [ADM’s evaluation suite](https://github.com/openai/guided-diffusion/tree/main/evaluations) to calculate FID, sFID, IS, and Precision.
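
The ADM evaluator compares two `.npz` batches: a reference batch and a batch of generated samples. Below is a minimal sketch of invoking it. It assumes you have cloned guided-diffusion, are in its `evaluations` directory with its requirements installed, and have already packed your generated images into an `.npz` file. `run_adm_eval` and the sample path are hypothetical; `VIRTUAL_imagenet256_labeled.npz` is the ImageNet 256x256 reference batch distributed with that suite.

```bash
# Sketch only: run_adm_eval is a hypothetical wrapper around the suite's evaluator.py.
run_adm_eval() {
    ref_batch="$1"      # reference batch, e.g. VIRTUAL_imagenet256_labeled.npz
    sample_batch="$2"   # generated samples packed as .npz (path is an assumption)
    for f in "$ref_batch" "$sample_batch"; do
        if [ ! -f "$f" ]; then
            echo "missing batch file: $f" >&2
            return 1
        fi
    done
    # evaluator.py takes the reference batch first, then the sample batch.
    python evaluator.py "$ref_batch" "$sample_batch"
}

# Example call (paths are placeholders):
# run_adm_eval VIRTUAL_imagenet256_labeled.npz /path/to/samples.npz
```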


## Acknowledgments

Our code builds on [Open-Sora v1.2](https://github.com/hpcaitech/Open-Sora/tree/opensora/v1.2), [PixArt-sigma](https://github.com/PixArt-alpha/PixArt-sigma), and [ViDiT-Q](https://github.com/thu-nics/ViDiT-Q).