# Setup

The easiest way to perform the attacks is to run the code in a Docker container. To build the Docker image, run the following command:
```bash
docker build -t dori .
```

# Performing NeMo
We include the scripts from the official NeMo repository, [github.com/ml-research/localizing_memorization_in_diffusion_models](https://github.com/ml-research/localizing_memorization_in_diffusion_models), to mirror their setting. The descriptions below are taken from that repository.

The following steps describe how to apply NeMo to detect memorizing neurons in Stable Diffusion. Each script provides multiple options; run a script with the ```-h``` option to list them. Default values correspond to the settings used in the main paper. The first two steps can be skipped since we already provide the required statistics and thresholds.

## 1. Calculating Activation Statistics (Optional)
To identify neurons that memorize specific samples, we must first calculate the activation statistics on unmemorized samples. Use the following script:
```bash
python 1_compute_activations_statistics.py
```
Pre-computed activation statistics for Stable Diffusion v1-4 and 50,000 LAION prompts are provided at ```statistics/statistics_additional_laion_prompts_v1_4.pt```.

## 2. Calculate SSIM Thresholds (Optional)
In addition to activation statistics on unmemorized prompts, we need SSIM thresholds for the neuron detection algorithm. First, calculate the pairwise SSIM between different seeds of unmemorized prompts:
```bash
python 2_compute_pairwise_ssim.py
```

Manually calculate the threshold by loading the resulting file with PyTorch and computing the mean and standard deviation of the SSIM scores. For the paper, the threshold is set to $0.428$, which corresponds to the mean SSIM score plus one standard deviation. This value is also the default in the following detection step.
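
The threshold computation can be sketched as follows. This is a minimal sketch, not code from the repository: the helper name `mean_plus_std` is hypothetical, and using the sample standard deviation is an assumption, since the text does not state which estimator was used. The pairwise SSIM scores are assumed to be available as a flat sequence of floats; how to obtain one from the file loaded with `torch.load` depends on how that file is structured.

```python
import statistics
from typing import Sequence


def mean_plus_std(scores: Sequence[float]) -> float:
    """Return the mean plus one sample standard deviation of the SSIM scores."""
    if len(scores) < 2:
        raise ValueError("need at least two scores to estimate a standard deviation")
    return statistics.fmean(scores) + statistics.stdev(scores)


# Toy values for illustration only; these are not scores from the paper.
example_scores = [0.2, 0.4, 0.6]
threshold = mean_plus_std(example_scores)
print(f"SSIM threshold: {threshold:.3f}")
```

If the result differs from the default of the detection step, it can presumably be passed to that script through one of its options; run the script with ```-h``` to check.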

## 3. Detect Memorization Neurons
To identify memorization neurons, run the following script. Both the initial selection and the refinement process are automatically executed:

```bash
python 3_detect_memorized_neurons.py
```

## 4. Image Generation
To calculate metrics, generate the original images (without blocking neurons) and then generate images with the identified neurons blocked. Use the following scripts:

```bash
python 4_generate_images.py --original_images -o=generated_images_unblocked
python 4_generate_images.py --refined_neurons -o=generated_images_blocked
```

## 5. Perform adversarial embedding optimization
To evaluate NeMo results with adversarial embeddings, run the following code:
```bash
python 5_generate_images_adv_embeddings.py --f=link_to_nemo_results.csv -o=generated_images_dori
```
The default parameters match the ones used during our main experiments. 

# Performing Wanda

## 1. Calculating the Input Norms
To identify weights that memorize specific samples, we must first calculate the input norms. Use the following script:
```bash
python wanda_01_get_input_norms.py --output input_norms.pkl
```

## 2. Image Generation
To calculate metrics, generate the original images (without blocking weights) and then generate images with the identified weights blocked. Use the following scripts:

```bash
python wanda_02_generate_images.py --original_images -o=generated_images_unblocked
python wanda_02_generate_images.py --input_norm_path input_norms.pkl -o=generated_images_blocked 
```

## 3. Perform adversarial embedding optimization
To evaluate Wanda results with adversarial embeddings, run the following code:
```bash
python wanda_04_attack.py --input_norm_path input_norms.pkl -o=generated_images_dori 
```
The default parameters match the ones used during our main experiments. 

# Evaluation Metrics

After generating images, compute the metrics by running the scripts in the [metrics](metrics) directory. For all metrics, provide the path to the CSV result file containing the detected neurons. To split the results into verbatim memorization (VM) and template memorization (TM) prompts, also provide the path to the original prompt file with ```-p=prompts/memorized_laion_prompts.csv```.

For SSCD-based metrics, download the model via ```wget https://dl.fbaipublicfiles.com/sscd-copy-detection/sscd_disc_mixup.torchscript.pt``` and place it in the project's root folder.

## Memorization
The memorization metrics measure the degree of memorization still present in the generated images. Generate images for each memorized prompt with activated/deactivated memorization neurons and measure the cosine similarities between image pairs using SSCD embeddings to quantify memorization. Additionally, measure the degree of memorization towards the original training images. First, download the original images following the URLs provided in the [prompt file](prompts/memorized_laion_prompts.csv). Ensure the downloaded images are enumerated like ```0001_first_image.jpg``` to match the generated and original images in the script. Higher SSCD scores indicate a higher degree of memorization. Run the following scripts to compute the memorization metrics:

```bash
python metrics/compute_sscd_gen.py -p=prompts/memorized_laion_prompts.csv -f=generated_images_blocked -r=generated_images_unblocked
python metrics/compute_sscd_orig.py -p=prompts/memorized_laion_prompts.csv -f=generated_images_blocked -r=original_images
```
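
For intuition, the quantity behind both scripts is the cosine similarity between two embedding vectors. The sketch below is plain Python and is not taken from the metric scripts: `cosine_similarity` is a hypothetical helper, and the short toy vectors stand in for real SSCD embeddings, which are assumed to come from the downloaded TorchScript model.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for the embeddings of a blocked and an unblocked image.
blocked = [1.0, 0.0, 1.0]
unblocked = [1.0, 1.0, 0.0]
print(cosine_similarity(blocked, unblocked))
```

A value close to 1 means the two images are near-duplicates in embedding space, which corresponds to a higher degree of memorization.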

## Diversity
The diversity metric assesses the variety of images generated for the same memorized prompt with different seeds. Deactivating memorization neurons increases the diversity of generated images. Compute the diversity metric by running the following script:
```bash
python metrics/compute_diversity.py -p=prompts/memorized_laion_prompts.csv -f=generated_images_blocked
```

## Quality
To assess the overall image quality of a diffusion model (DM) with activated/deactivated neurons, compute the Fréchet Inception Distance (FID), CLIP-FID, and Kernel Inception Distance (KID) on COCO prompts using the [clean-fid](https://github.com/GaParmar/clean-fid) implementation. This implementation requires two folders: one with the original images and one with the generated images.
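
A minimal sketch of such an evaluation with the clean-fid Python API follows. It assumes the package is installed (`pip install clean-fid`). The helper name `compute_quality_metrics` and the folder names in the note below are placeholders, not files shipped with this repository. The three calls follow the usage shown in the clean-fid README.

```python
from typing import Dict


def compute_quality_metrics(real_dir: str, generated_dir: str) -> Dict[str, float]:
    """Compute FID, CLIP-FID, and KID between two image folders with clean-fid."""
    # Imported lazily so this module can be loaded without clean-fid installed.
    from cleanfid import fid

    return {
        "fid": fid.compute_fid(real_dir, generated_dir),
        "clip_fid": fid.compute_fid(
            real_dir, generated_dir, mode="clean", model_name="clip_vit_b_32"
        ),
        "kid": fid.compute_kid(real_dir, generated_dir),
    }
```

Calling `compute_quality_metrics("coco_original_images", "coco_generated_images")` with two such folders would return a dictionary holding the three scores. For all three, lower values indicate that the generated images are closer to the reference set.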

Additionally, compute the similarities between the generated images and the input prompts using CLIP scores to ensure alignment between the generated images and their prompts. Run the following script to compute the prompt alignment:

```bash
python metrics/compute_prompt_alignment.py -p=prompts/memorized_laion_prompts.csv -f=generated_images_blocked
```
