This anonymized/redacted version of the code is provided for reference during review processes only.

The watermarking and watermark detection code itself is an extension of the `WatermarkLogitsProcessor` and `WatermarkDetector` classes.

A pipeline abstraction was implemented to manage the "generation", paraphrase "attack", and "evaluation" or detection phases. The general setup is that data, i.e. sets of generated samples, is written and read by each stage as "json lines" files `*.jsonl` with associated metadata files `*.json` to keep track of parameter settings used at each stage.

A prose version of usage instructions for the pipeline is described in a separate markdown file here: [PIPELINE.md](PIPELINE.md)

## wandb

The pipeline scripts, and in particular, the evaluation stage where detection is run and generation quality metrics are computed, are configured to push results to weights and biases (wandb). The figures in the paper are produced by:
1. sketching out the charts in wandb using filters and tags 
2. exporting/downloading the csv's of the data for each chart, and 
3. loading them in a notebook to format plots as necessary.

Alternately, the evaluation stage also saves a jsonl file where every line is a set of generations and all associated metrics and detection scores computed for it. This can also be loaded and analyzed manually in pandas, though the ROC space analyzes and average@T series for some metrics will have to be recomputed.

## llama

In order to use the llama model, you need to bring-your-own-weights, and then covert them to the huggingface format. 

