# Natural Identifiers for Privacy and Data Audits in Large Language Models
Assessing the privacy of large language models (LLMs) presents significant challenges. In particular, most existing methods for auditing *differential privacy* require the insertion of specially crafted canary data *during training*, making them impractical for auditing already-trained models without costly retraining. Additionally, *dataset inference*, which audits whether a suspect dataset was used to train a model, is *infeasible* without access to a private non-member held-out dataset. Yet such held-out datasets are often unavailable or difficult to construct in real-world cases, since they must be drawn from the same distribution as the suspect data (IID). These limitations severely hinder scalable, *post-hoc* audits. To enable such audits, this work introduces **natural identifiers (NIDs)**. NIDs are structured random strings, such as cryptographic hashes and shortened URLs, that occur naturally in common LLM training datasets. Their format enables the generation of unlimited additional random strings from the same distribution, which can act as alternative canaries for audits and as same-distribution held-out data for dataset inference. Our evaluation shows that NIDs enable post-hoc differential privacy auditing *without any retraining* and enable dataset inference for any suspect dataset containing NIDs without the need for a private non-member held-out dataset.
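
To illustrate how a known format allows sampling fresh strings from the same distribution, here is a minimal sketch. It is not the repository's generator: the function names, the 40-character hex length (the SHA-1 digest format), and the `https://sho.rt/` prefix with a 7-character slug are all assumptions chosen for illustration.

```python
import secrets
import string

HEX = "0123456789abcdef"
BASE62 = string.ascii_letters + string.digits


def random_hex_digest(length: int = 40) -> str:
    """Return a uniformly random lowercase hex string.

    A length of 40 matches the SHA-1 digest format (assumed example format).
    """
    return "".join(secrets.choice(HEX) for _ in range(length))


def random_short_url(prefix: str = "https://sho.rt/", slug_length: int = 7) -> str:
    """Return a URL with a random base-62 slug (hypothetical shortener format)."""
    return prefix + "".join(secrets.choice(BASE62) for _ in range(slug_length))


def generate_nids(n: int, generator=random_hex_digest) -> list[str]:
    """Generate n fresh identifiers with the given format generator."""
    return [generator() for _ in range(n)]
```

Because each string is sampled independently of any training corpus, strings generated this way can be treated as non-members that share the format of the identifiers found in the data.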
## Dataset Inference

### Environment Setup
This project uses a Docker container based on the NVIDIA PyTorch image. To set up the environment, simply build and run the Docker container. The `Dockerfile` installs all necessary dependencies automatically.

To build and run the Docker image:

```bash
# Build the Docker image
docker build -t nids-image .

# Run the Docker container with GPU support
docker run --gpus all -it --rm nids-image /bin/bash
```



### Run the NIDs extraction
1. Run `run_regex.py` to extract potential NIDs
2. Run `blind_baseline_step_1.py` for the first filtering step
3. Run `blind_baseline_step_2.py` for the second filtering step
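
As a rough idea of what regex-based extraction of candidate NIDs can look like, the sketch below matches standalone hex strings of common digest lengths. It is an assumption for illustration only: the pattern, the chosen lengths (32, 40, and 64 characters, as in MD5, SHA-1, and SHA-256 digests), and the `extract_candidates` helper are hypothetical and do not describe what `run_regex.py` actually does.

```python
import re

# Assumed pattern: a hex string of exactly 64, 40, or 32 characters that is
# not embedded in a longer hex run (enforced by the lookbehind and lookahead).
HEX_DIGEST = re.compile(
    r"(?<![0-9a-fA-F])"
    r"(?:[0-9a-fA-F]{64}|[0-9a-fA-F]{40}|[0-9a-fA-F]{32})"
    r"(?![0-9a-fA-F])"
)


def extract_candidates(text: str) -> list[str]:
    """Return all digest-shaped hex strings found in the text, in order."""
    return HEX_DIGEST.findall(text)
```

Matches from a pattern like this are only candidates; the filtering steps listed above are still needed.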

### Extract the features
1. Run `eval_secrets.py`

### Run Dataset Inference
1. Run `run_dataset_inference.py`
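
Conceptually, dataset inference compares model scores on the suspect NIDs against scores on freshly generated NIDs that serve as held-out non-members. The sketch below uses a simple one-sided permutation test on per-string scores such as losses. This is a stand-in technique chosen for illustration, not necessarily the test implemented in `run_dataset_inference.py`, and the function name and parameters are hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(suspect_scores, heldout_scores, n_permutations=10_000, seed=0):
    """One-sided permutation test on the difference of means.

    Null hypothesis: suspect and held-out scores come from the same
    distribution. A small p-value suggests the suspect scores are lower
    (for example, lower loss), which is consistent with membership.
    """
    rng = random.Random(seed)
    observed = mean(heldout_scores) - mean(suspect_scores)
    pooled = list(suspect_scores) + list(heldout_scores)
    k = len(suspect_scores)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        # Re-split the shuffled pool into pseudo-suspect and pseudo-held-out groups.
        diff = mean(pooled[k:]) - mean(pooled[:k])
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_permutations + 1)
```

Here `suspect_scores` would come from the NIDs extracted from the suspect dataset and `heldout_scores` from generated NIDs of the same format.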


## DP Auditing
To execute the audit, use the following command:

`python run_audit.py`

Important hyperparameters:

- `--cardinality`: the number of generated identifiers.
- `--number_of_canaries`: the total number of natural identifier samples used in the audit.
- `--black_box_audit`: runs black-box auditing.
- `--canary_types`: the natural identifier type.
