# Break the Trade-off Between Watermark Strength and Speculative Sampling Efficiency for Language Models

This repository contains the implementation and experimental code for analyzing the trade-off between watermark strength and speculative sampling efficiency in language models.

## Repository Structure

### `real/`
Contains the main experimental framework and watermarking implementations:
- **`unbiased_watermark/`**: Core watermarking algorithms, including SynthID and Gumbel-based methods.
- **`my_experiment/`**: Experiment configurations and execution scripts for different model combinations.
- **`experiments/`**: General experiment framework and worker implementations.
- **`accuwm/`**: Various language model inference algorithms, including Algorithm 1 in the paper.
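
For readers new to Gumbel-based watermarking, the idea can be sketched in a few lines. This is a minimal illustration of the standard Gumbel-max trick and is not the code in `unbiased_watermark/`. The helper names `keyed_uniforms` and `gumbel_max_sample`, and the SHA-256 seeding scheme, are assumptions made for this example only.

```python
import hashlib

import numpy as np


def keyed_uniforms(key: bytes, context: tuple, vocab_size: int) -> np.ndarray:
    """Derive one pseudorandom uniform in (0, 1] per vocabulary entry.

    The seed depends only on the secret key and the preceding tokens, so a
    detector holding the key can regenerate the same numbers.
    (Hypothetical seeding scheme, chosen for illustration.)
    """
    digest = hashlib.sha256(key + repr(context).encode()).digest()
    rng = np.random.default_rng(int.from_bytes(digest[:8], "big"))
    return 1.0 - rng.random(vocab_size)  # shift [0, 1) to (0, 1] so log is finite


def gumbel_max_sample(probs, uniforms) -> int:
    """Return argmax_i u_i ** (1 / p_i), computed as argmax_i log(u_i) / p_i.

    For independent uniform u_i, the selected index follows `probs`, so the
    watermark does not bias the output distribution.
    """
    probs = np.asarray(probs, dtype=float)
    uniforms = np.asarray(uniforms, dtype=float)
    scores = np.full(probs.shape, -np.inf)  # zero-probability tokens never win
    support = probs > 0
    scores[support] = np.log(uniforms[support]) / probs[support]
    return int(np.argmax(scores))


if __name__ == "__main__":
    probs = [0.5, 0.3, 0.2]
    u = keyed_uniforms(b"secret", (17, 42), len(probs))
    print("sampled token:", gumbel_max_sample(probs, u))
```

The same key and context always select the same token, which is what makes detection possible, while averaging over contexts recovers the original distribution.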

### `simulation/`
Contains simulation code for theoretical analysis:
- **`utils.py`**: Mathematical utility functions for sampling and probability computations.
- **`simulation_trade-off_linear.py`**: Trade-off curves for linear watermarked classes.
- **`simulation_trade-off_Hu-Google.py`**: Trade-off curves for Hu's and Google's classes.
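
As background for the efficiency axis of these trade-off curves, the sketch below shows one step of standard speculative sampling and the acceptance rate it implies. It is an illustration only and does not reproduce `utils.py` or the simulation scripts. The function names `acceptance_rate` and `speculative_step` are hypothetical.

```python
import numpy as np


def acceptance_rate(p, q) -> float:
    """Probability that a draft token is accepted: sum_i min(p_i, q_i).

    This equals 1 minus the total variation distance between the target
    distribution p and the draft distribution q.
    """
    return float(np.minimum(np.asarray(p, dtype=float), np.asarray(q, dtype=float)).sum())


def speculative_step(p, q, rng):
    """Run one accept/reject step; return (token, accepted).

    The draft token x ~ q is accepted with probability min(1, p[x] / q[x]).
    On rejection the token is redrawn from the normalized residual
    max(p - q, 0), so the returned token is distributed according to p.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    x = int(rng.choice(len(q), p=q))
    if rng.random() < min(1.0, p[x] / q[x]):
        return x, True
    residual = np.maximum(p - q, 0.0)
    residual /= residual.sum()  # nonzero whenever a rejection can occur
    return int(rng.choice(len(p), p=residual)), False


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p = [0.5, 0.3, 0.2]  # target model distribution (made-up numbers)
    q = [0.2, 0.3, 0.5]  # draft model distribution (made-up numbers)
    print("expected acceptance rate:", acceptance_rate(p, q))
    print("one step:", speculative_step(p, q, rng))
```

A higher acceptance rate means more draft tokens survive verification, which is the sense in which the draft and target distributions determine sampling efficiency.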

## Citation

This code builds on the following work:

```bibtex
@inproceedings{
  hu2024inevitable,
  title={Inevitable Trade-off between Watermark Strength and Speculative Sampling Efficiency for Language Models},
  author={Hu, Zhengmian and Huang, Heng},
  booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
  year={2024},
}
```

## Requirements

- Python 3.9+
- PyTorch >= 2.0.0
- Transformers >= 4.30.0
- NumPy >= 1.24.0
- Additional dependencies listed in `requirements.txt`
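
To check an environment against the minimum versions above, a small script along these lines may help. It is a sketch, not part of the repository. It assumes the packages are installed under the distribution names `torch`, `transformers`, and `numpy`, and it compares only the leading numeric part of each version string.

```python
import re
import sys
from importlib import metadata

# Minimum versions copied from the list above.
MINIMUMS = {"torch": "2.0.0", "transformers": "4.30.0", "numpy": "1.24.0"}


def version_tuple(version: str) -> tuple:
    """Parse the leading numeric release, e.g. '2.1.0+cu118' -> (2, 1, 0)."""
    match = re.match(r"\d+(\.\d+)*", version)
    parts = tuple(int(part) for part in match.group(0).split(".")) if match else ()
    return parts + (0,) * max(0, 3 - len(parts))  # pad so '2' compares like '2.0.0'


def meets_minimum(installed: str, required: str) -> bool:
    """Return True if the installed version is at least the required one."""
    return version_tuple(installed) >= version_tuple(required)


def report() -> list:
    """Return one status line for Python and for each required package."""
    python_ok = sys.version_info >= (3, 9)
    lines = [f"python {sys.version.split()[0]}: {'ok' if python_ok else 'need >= 3.9'}"]
    for name, minimum in MINIMUMS.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            lines.append(f"{name}: not installed (need >= {minimum})")
            continue
        status = "ok" if meets_minimum(installed, minimum) else f"need >= {minimum}"
        lines.append(f"{name} {installed}: {status}")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

Pre-release suffixes such as `rc1` are ignored by this comparison, so treat the result as a quick sanity check rather than a full version resolver.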

## Usage

See individual experiment directories for specific usage instructions and configuration examples.
