## Image2Text2Image DP: Differentially Private Data Synthesis Across Modalities
We believe that the text modality can introduce greater variation in image generation under differential privacy (DP) constraints without significantly compromising the quality of the generated data. We call our proposed method **Image2Text2Image DP**, or **DPITI** for short.

### 💡 Proposed Methodology
Following [Lin et al. (2024)](https://openreview.net/forum?id=YEhQs8POIo) and [Xie et al. (2024)](https://arxiv.org/abs/2403.01749), we aim to exploit more of the potential of LLMs and diffusion models. We therefore propose the following method to increase the variety of the synthesized data while preserving accuracy.

<img src="docs/images0.png" width="300">

### Environment
```bash
# Please prepare torch, transformers and diffusers yourself
conda create -n textdp python=3.12
conda activate textdp

conda install transformers diffusers datasets
# Install Private Evolution (PE) and its dependencies
conda install -y -c pytorch -c nvidia faiss-gpu=1.8.0
pip install "private-evolution @ git+https://github.com/microsoft/DPSDA.git"
pip install "private-evolution[image,text] @ git+https://github.com/microsoft/DPSDA.git"
pip install einops
# Others...
```
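
After installation, a quick import check can confirm that the environment is usable. The snippet below is a minimal sketch, not part of this repository: the helper name `check_packages` is hypothetical, and the import names `faiss` (for `faiss-gpu`) and `pe` (for `private-evolution`) are assumptions about how those packages expose themselves.

```python
import importlib.util


def check_packages(names):
    """Return a dict mapping each top-level module name to True if it is importable."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Import names are assumptions; adjust them if your installation differs.
    required = ["torch", "transformers", "diffusers", "datasets", "faiss", "pe", "einops"]
    for name, found in check_packages(required).items():
        print(f"{name:15s} {'ok' if found else 'MISSING'}")
```

Run it inside the activated `textdp` environment; any package reported as `MISSING` needs to be installed before continuing.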
