# BitMark for Infinity: Watermarking Bitwise Autoregressive Image Generative Models

## References

This code is based on the code for Infinity (Infinity ∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis, Jian Han et al., CVPR 2025), https://github.com/FoundationVision/Infinity, and on the code for A Watermark for Large Language Models (John Kirchenbauer et al., ICML 2023), https://github.com/jwkirchenbauer/lm-watermarking.

## Abstract 

State-of-the-art text-to-image models like Infinity generate photorealistic images at an unprecedented speed. These models operate in a bitwise autoregressive manner over a discrete set of tokens that is practically infinite in size. However, their impressive generative power comes with a growing risk: as their outputs increasingly populate the Internet, they are likely to be scraped and reused as training data, potentially by the very same models. This phenomenon has been shown to lead to model collapse, where repeated training on generated content, especially from the models' own previous versions, causes a gradual degradation in performance. A promising mitigation strategy is watermarking, which embeds human-imperceptible yet detectable signals into generated images, enabling the identification of generated content. In this work, we introduce BitMark, a robust bitwise watermarking framework for Infinity. Our method embeds a watermark directly at the bit level of the token stream across multiple scales (also referred to as resolutions) during Infinity's image generation process. Our bitwise watermark subtly influences the bits to preserve visual fidelity and generation speed while remaining robust against a spectrum of removal techniques. Furthermore, it exhibits high radioactivity, i.e., when watermarked generated images are used to train another image generative model, this second model's outputs will also carry the watermark. The radioactive traces remain detectable even when only fine-tuning diffusion or image autoregressive models on images watermarked with our BitMark. Overall, our approach provides a principled step toward preventing model collapse in image generative models by enabling reliable detection of generated outputs.


## Preparation


### Installing Requirements

The requirements are specified in `environment.yml` and can be installed with conda, for example via `conda env create -f environment.yml`.

### Prepare Models

Download the Infinity-2B model and the VAE-32 reg checkpoint, both available at https://huggingface.co/FoundationVision/Infinity.

## Running BitMark

BitMark is defined in `extended_watermark_processor.py`. The script `tools/comprehensive_infer.py` runs the Infinity model and watermarks the generated images with BitMark. Run it with:

```
python tools/comprehensive_infer.py --model_path "./infinity_2b_reg.pth" --vae_type 32 \
  --vae_path "infinity_vae_d32_reg.pth" --add_lvl_embeding_only_first_block 1 \
  --model_type "infinity_2b" --seed 0 --watermark_scales 2 --watermark_delta 2 \
  --watermark_context_width 2 --out_dir "./" --jsonl_filepath "captions.json"
```

Update `--model_path` and `--vae_path` to the locations where the respective models are saved. `--jsonl_filepath` points to a JSON file containing the prompts used for image generation. The delta (`--watermark_delta`) and the sequence length (`--watermark_context_width`) can be adjusted as needed.
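To give a rough feel for what the delta and the sequence length control, the following is a minimal sketch of a Kirchenbauer-style green-list bias applied to single bits. It is an illustration under stated assumptions, not the code in `extended_watermark_processor.py`: the function names are hypothetical, the green list is drawn from a seeded `random.Random`, a pattern is assumed to be `context_width` preceding bits plus the next bit, and a stand-in "model" with flat logits replaces Infinity. The real method additionally works across multiple scales.

```python
import itertools
import math
import random


def green_patterns(context_width, seed=0):
    """Assumption: for each context, exactly one next bit is 'green'."""
    rng = random.Random(seed)
    return {
        context + (rng.randint(0, 1),)
        for context in itertools.product((0, 1), repeat=context_width)
    }


def bias_bit_logits(logits, context, green, delta):
    """Add delta to the logit of the bit that completes a green pattern."""
    biased = list(logits)
    for bit in (0, 1):
        if tuple(context) + (bit,) in green:
            biased[bit] += delta
    return biased


def sample_bits(n, context_width, green, delta, seed=1):
    """Sample n bits from a stand-in model with flat logits [0, 0]."""
    rng = random.Random(seed)
    bits = [rng.randint(0, 1) for _ in range(context_width)]
    for _ in range(n):
        context = bits[len(bits) - context_width:]
        logits = bias_bit_logits([0.0, 0.0], context, green, delta)
        # Two-way softmax: probability that the next bit is 1.
        p_one = 1.0 / (1.0 + math.exp(logits[0] - logits[1]))
        bits.append(1 if rng.random() < p_one else 0)
    return bits


def z_score(bits, green, context_width):
    """z-test against the null that each pattern is green with probability 1/2."""
    total = len(bits) - context_width
    hits = sum(
        tuple(bits[i:i + context_width + 1]) in green for i in range(total)
    )
    return (hits - 0.5 * total) / math.sqrt(0.25 * total)


if __name__ == "__main__":
    green = green_patterns(context_width=2)
    watermarked = sample_bits(2000, 2, green, delta=2.0)
    clean = sample_bits(2000, 2, green, delta=0.0)
    print("watermarked z:", z_score(watermarked, green, 2))
    print("clean z:", z_score(clean, green, 2))
```

In this toy setting a larger delta pushes more bits toward green patterns, which raises the z-score but also distorts the sampling distribution more strongly.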

### Additional Experiments

The robustness evaluation is in `tools/robustness_test.py`. It requires a path to clean images and a path to watermarked images, and computes the TPR@1%FPR over all images in the watermarked path for every attack.
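For readers unfamiliar with the metric, here is a minimal sketch of how TPR@1%FPR can be computed from detection scores. It is not the code in `tools/robustness_test.py`: the function name `tpr_at_fpr` is hypothetical, the scores are assumed to be numbers where higher means "more likely watermarked", and the toy values are made up purely for illustration.

```python
import math


def tpr_at_fpr(clean_scores, watermarked_scores, target_fpr=0.01):
    """Return (tpr, threshold) with at most target_fpr of clean scores above threshold.

    Assumes 0 <= target_fpr < 1 and non-empty score lists.
    """
    ranked = sorted(clean_scores, reverse=True)
    # Number of clean images that may be (wrongly) flagged as watermarked.
    allowed = math.floor(target_fpr * len(ranked) + 1e-9)
    # Flag a score only if it is strictly above the (allowed + 1)-th largest clean score.
    threshold = ranked[allowed]
    detected = sum(score > threshold for score in watermarked_scores)
    return detected / len(watermarked_scores), threshold


if __name__ == "__main__":
    clean = [float(i) for i in range(100)]   # toy scores of clean images
    attacked = [97.0, 98.0, 99.0, 150.0]     # toy scores of attacked watermarked images
    tpr, threshold = tpr_at_fpr(clean, attacked)
    print(f"TPR@1%FPR = {tpr:.2f} at threshold {threshold}")
```

In a robustness evaluation the same threshold logic would be applied once per attack, using the scores of the attacked watermarked images.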

Our novel BitFlipper attack is applied in `flipper.py`.
