# Readout representation on vision models

Examine the representation size by reconstructing images from noise-perturbed features. The noise variance is calibrated so that the correlation distance between the original and the noised features matches a specified target.

```
250527_readout_vision/
├── README.md
├── sample_images.py        # randomly sample images from ImageNet ILSVRC dataset
├── save_features.py        # save features of the models
├── reconstruct.py          # reconstruct images from original and perturbed features
├── plot_images.py          # visualize reconstruction results
├── notebooks/              # notebooks to visualize results
└── configs/                # config files
```

## Config

All configuration parameters are defined in a single YAML file.

| Section         | Key                                       | Type              | Optional | Description |
|-----------------|-------------------------------------------|-------------------|----------|-------------|
| Model           | `model.name`                              | string            | No       | Model family name (e.g., `clip-tfm`, `dinov2`). |
|                 | `model.pretrained`                        | string            | No       | Path to pretrained weights or HuggingFace model ID. |
|                 | `model.model_alias`                       | string            | Yes      | Alias used for naming results. Defaults to `name` or `pretrained`. |
| Experiment      | `exp_name`                                | string            | No       | Name of the experiment. Used in result directories. |
| Data            | `data.dataset_name`                       | string            | No       | Dataset name for organizing output. |
|                 | `data.image_dir`                          | string            | No       | Path to directory containing source images. |
|                 | `data.image_names`                        | list of strings   | Yes      | List of image names (without extension). |
|                 | `data.image_names_path`                   | string            | Yes      | YAML file listing image names. Overrides `image_names` if provided. |
|                 | `data.image_ext`                          | string            | No       | Image extension (e.g., `.JPEG`, `.jpg`). |
| Noise           | `noise.target_corr_dists`                 | list of floats    | No       | Target correlation distances between original and noised features. |
|                 | `noise.noise_seeds`                       | list of integers  | No       | Random seeds for noise generation. |
|                 | `noise.tol`                               | float             | No       | Tolerance for matching target correlation distance. |
| Batch           | `batch_size`                              | int               | No       | Number of images optimized simultaneously on one GPU. |
| Pipeline        | `pipeline.num_iterations`                 | int               | No       | Number of optimization iterations. |
|                 | `pipeline.eval_interval`                  | int               | Yes      | Interval for evaluation. Default is `1`. |
|                 | `pipeline.grad_clip`                      | bool              | Yes      | Whether to enable gradient clipping. Default is `False`. |
|                 | `pipeline.log_interval`                   | int               | Yes      | Terminal print interval. Default is `-1` (disabled). |
|                 | `pipeline.wandb_log_interval`             | int               | Yes      | Logging interval to Weights & Biases. Default is `1`. |
| Generator       | `generator.name`                          | string            | No       | Generator type (e.g., `deepimageprior`). |
| Critic          | `critic.name`                             | string            | No       | Loss function (e.g., `mse`). |
| Optimizer       | `optimizer.name`                          | string            | No       | Optimizer name (e.g., `adamw`). |
|                 | `optimizer.lr`                            | float             | No       | Learning rate. |
|                 | `optimizer.scheduler.name`                | string            | Yes      | Name of learning rate scheduler (e.g., `LinearLR`). |
|                 | `optimizer.scheduler.start_factor`        | float             | Depends  | Scheduler-specific: start scaling factor. |
|                 | `optimizer.scheduler.end_factor`          | float             | Depends  | Scheduler-specific: end scaling factor. |
| Layers          | `layers`                                  | list of strings   | No       | Human-readable layer names to reconstruct from. |
|                 | `layer_mapping`                           | dict              | No       | Maps human-readable names to actual model layer paths. |
| Output Override | `feature_dir`                             | string            | Yes      | Override default feature directory. |
| Logging         | `wandb`                                   | bool              | Yes      | Enable Weights & Biases logging. |
| Save Features   | `save_target_features`                    | bool              | Yes      | Save target features. |

---

### Notes

- **Per-layer execution**: All reconstructions are run independently for each layer specified in `layers`.
- **Distributed setup**: Supports multi-node and multi-GPU execution via shared storage, with JSON database tracking and file locking.

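As a quick sanity check before launching a run, the required keys from the table above can be verified on the loaded config. The sketch below assumes the YAML file has already been parsed into a nested `dict` (for example with PyYAML's `yaml.safe_load`). The helpers `get_nested` and `check_config` are hypothetical and not part of this directory.

```python
# Keys marked as non-optional in the config table.
REQUIRED_KEYS = [
    "model.name", "model.pretrained", "exp_name",
    "data.dataset_name", "data.image_dir", "data.image_ext",
    "noise.target_corr_dists", "noise.noise_seeds", "noise.tol",
    "batch_size", "pipeline.num_iterations",
    "generator.name", "critic.name",
    "optimizer.name", "optimizer.lr",
    "layers", "layer_mapping",
]

_MISSING = object()  # sentinel so that explicit None values still count as present


def get_nested(config, dotted_key):
    """Look up 'a.b.c' in a nested dict, returning _MISSING if any part is absent."""
    node = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return _MISSING
        node = node[part]
    return node


def check_config(config):
    """Return a list of human-readable problems; an empty list means the check passed."""
    problems = [
        f"missing required key: {key}"
        for key in REQUIRED_KEYS
        if get_nested(config, key) is _MISSING
    ]
    # Every human-readable layer name needs a model layer path to resolve to.
    mapping = config.get("layer_mapping") or {}
    for layer in config.get("layers") or []:
        if layer not in mapping:
            problems.append(f"layer '{layer}' has no entry in layer_mapping")
    return problems
```

Calling `check_config(config)` right after loading the YAML file surfaces missing keys before any features are extracted.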


## Usage

### 0. Sample images from ImageNet ILSVRC test split
If you use images sampled from the ImageNet ILSVRC dataset, run:
```
python sample_images.py --n 256 --output_dir path/to/the/output/dir/
```
This samples `n` images from the `ILSVRC/imagenet-1k` dataset on HuggingFace and saves them in `output_dir`.

### 1. Save features of the model
First, save features of the model by running:
```
python save_features.py path/to/the/config.yaml
```
This saves the features to `output/readout_vision/features/{model_alias}/{layer_name}/{image_name}.mat`.

Args:
- `config_path`: path to the config file
- `device`: device used for feature extraction (e.g. `cuda`, `cpu`). Defaults to `cuda`.

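With `argparse`, the command-line interface described above could look roughly like this. It is a sketch of the documented arguments only: a positional `config_path` and an optional device that defaults to `cuda`. Spelling the device option as `--device` is an assumption based on the multi-GPU instructions, and the actual parser in `save_features.py` may define more options. `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser covering only the two arguments documented in this README."""
    parser = argparse.ArgumentParser(description="Save model features for the configured images.")
    parser.add_argument("config_path", help="path to the YAML config file")
    parser.add_argument("--device", default="cuda", help="device for feature extraction, e.g. cuda or cpu")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration
args = build_parser().parse_args(["path/to/the/config.yaml", "--device", "cpu"])
print(args.config_path, args.device)
```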

### 2. Run experiments
Once the features are saved, run the experiments with:
```
python reconstruct.py path/to/the/config.yaml
```
Args:
- `config_path`: path to the config file
- `device`: device used to run the experiments. Defaults to `cuda`.

This reconstructs images from perturbed features that have the specified correlation distances. In total, it performs roughly $N_{img} \times N_{layer} \times N_{distance} \times N_{noise\_seeds} \times N_{gen\_seeds}$ reconstructions, which can be a fairly large number.

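To get a feel for the size of this grid before launching, the combinations can be enumerated directly from the config values. The sketch below is an assumption about how the grid is formed, not the logic in `reconstruct.py`: it treats distance `0` as a single run without a noise seed (matching the `noise_seed_None` output directory) and leaves out generator seeds, which do not appear in the config table. `enumerate_runs` is a hypothetical helper, and the image and layer names are placeholders.

```python
from itertools import product


def enumerate_runs(image_names, layers, corr_dists, noise_seeds):
    """List (image, layer, distance, noise_seed) tuples for one experiment."""
    runs = []
    for image, layer, dist in product(image_names, layers, corr_dists):
        # Without noise there is nothing to seed, so distance 0 yields one run.
        seeds = [None] if dist == 0 else noise_seeds
        runs.extend((image, layer, dist, seed) for seed in seeds)
    return runs


runs = enumerate_runs(
    image_names=[f"img_{i}" for i in range(256)],
    layers=["layer_a", "layer_b", "layer_c"],
    corr_dists=[0.0, 0.2, 0.4],
    noise_seeds=[0, 1],
)
print(len(runs))  # 256 images * 3 layers * (1 + 2 + 2) seed slots = 3840
```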
#### Multi-node and multi-GPU support
To make efficient use of the available compute, this script can run the experiment on multiple nodes, as long as they share storage: simply launch the script on each node. To use multiple GPUs, launch the script several times with a different `--device` value for each run. This works because a JSON file tracks which combinations are done and which are still pending.

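One way such a tracking file can be shared safely between processes is sketched below. It is an illustration under stated assumptions rather than the implementation in `reconstruct.py`: the database is assumed to be a JSON object mapping an experiment id to a record with a `status` field, and mutual exclusion is obtained by atomically creating a lock file with `os.O_EXCL`. The helpers `file_lock` and `claim_next_pending` are hypothetical.

```python
import json
import os
import time
from contextlib import contextmanager


@contextmanager
def file_lock(lock_path, poll=0.05, timeout=30.0):
    """Hold an exclusive lock by creating lock_path; wait while another process holds it."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so only one creator wins.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() > deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll)
    try:
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)  # release the lock


def claim_next_pending(db_path, lock_path):
    """Mark the first pending experiment as running and return its id, or None if none is left."""
    with file_lock(lock_path):
        with open(db_path) as f:
            db = json.load(f)
        for exp_id, record in db.items():
            if record["status"] == "pending":
                record["status"] = "running"
                with open(db_path, "w") as f:
                    json.dump(db, f, indent=2)
                return exp_id
    return None
```

Each worker would call `claim_next_pending` in a loop until it returns `None`. A worker that crashes while holding the lock leaves the lock file behind in this simple scheme, so a real setup needs some form of stale-lock handling.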

#### Output
Results will be saved in `output/readout_vision/results/{model_alias}/{dataset_name}/{exp_name}/`

This directory has the following structure: layer -> correlation distance -> noise seed -> image

```
{exp_dir}/
├── experiment_db.json          # progress and summary of the results
├── experiment_db.lock          # lock file for the experiment_db.json
├── {layer}/
│   ├── corr_dist_0.2/
│   │   ├── noise_seed_0/
│   │   │   └── {image_name}/
│   │   │       ├── final.png
│   │   │       ├── history.csv
│   │   │       ├── summary.json
│   │   │       └── snapshot/
│   │   │           ├── step_1000.png
│   │   │           ├── step_2000.png
│   │   │           └── ...
│   │   └── noise_seed_1/
│   │       └── {image_name}/
│   │           └── ...
│   ├── corr_dist_0.4/
│   │   └── ...
│   ├── ...
│   └── corr_dist_0.0/
│       └── noise_seed_None/    # distance 0 = no noise = no noise seeds
│           └── {image_name}/
│               └── ...
```
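When post-processing results, the location of a single reconstruction can be derived from this layout. The helper below is a hypothetical sketch that mirrors the directory names shown in the tree; how the scripts actually format the distance (for example the number of decimals) is an assumption.

```python
from pathlib import Path


def result_dir(exp_dir, layer, corr_dist, noise_seed, image_name):
    """Directory of one reconstruction, following the tree above."""
    return (
        Path(exp_dir)
        / layer
        / f"corr_dist_{corr_dist}"
        / f"noise_seed_{noise_seed}"  # a seed of None gives "noise_seed_None"
        / image_name
    )


def list_snapshots(run_dir):
    """Snapshot images of one run, sorted by optimization step."""
    return sorted(
        Path(run_dir).glob("snapshot/step_*.png"),
        key=lambda p: int(p.stem.split("_")[1]),
    )


final_image = result_dir("path/to/exp_dir", "layer_a", 0.2, 0, "img_0") / "final.png"
```

Sorting by the integer step avoids the lexicographic ordering in which `step_10000.png` would come before `step_2000.png`.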



`experiment_db.json` contains the status of each experiment and a short summary of its results. For each experiment, it stores:
- `status`: pending, running, finished, or error
- `correlation_distance`: the correlation distance between the original feature and the target feature, as specified in the config. Note that the actual correlation distance may differ slightly from this value.
- `true_target_*`: distance between the original and the target feature. Without a suffix, the distance is measured in the target layer.
- `target_recon_*`: distance between the original and the reconstructed feature at the final iteration. Without a suffix, the distance is measured in the target layer.
- `true_recon_*`: distance between the target and the reconstructed feature at the final iteration. Without a suffix, the distance is measured in the target layer.
- `pixel_*`: metrics between the original and the reconstructed image at the final iteration.

If you only need quantitative results for the reconstructed images, `experiment_db.json` is sufficient.

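For example, a metric can be averaged over all finished experiments per correlation distance in a few lines. The layout of `experiment_db.json` is assumed here to be a JSON object mapping an experiment id to a record with the fields listed above. The metric name `pixel_mse` in the usage example is hypothetical, so substitute a key that actually appears in your database. `load_db` and `mean_metric_by_distance` are not part of this directory.

```python
import json
from collections import defaultdict
from statistics import mean


def load_db(path):
    """Read the experiment database from disk."""
    with open(path) as f:
        return json.load(f)


def mean_metric_by_distance(db, metric):
    """Average `metric` over finished experiments, grouped by correlation distance."""
    grouped = defaultdict(list)
    for record in db.values():
        if record.get("status") == "finished" and metric in record:
            grouped[record["correlation_distance"]].append(record[metric])
    return {dist: mean(values) for dist, values in sorted(grouped.items())}


# Usage with an in-memory stand-in for load_db("path/to/experiment_db.json")
db = {
    "exp_0": {"status": "finished", "correlation_distance": 0.2, "pixel_mse": 0.25},
    "exp_1": {"status": "finished", "correlation_distance": 0.2, "pixel_mse": 0.75},
    "exp_2": {"status": "running", "correlation_distance": 0.4},
}
print(mean_metric_by_distance(db, "pixel_mse"))
```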
### 3. Visualize qualitative results
To visualize the reconstructed images, run:
```
python plot_images.py path/to/the/config.yaml
```
This plots the reconstructed images and saves them to `{exp_dir}/summary/recon_images/{image_name}/{noise_seed}.png`.

### 4. Calculate representation size
```
python scripts/250527_readout_vision/calculate_size.py path/to/the/config.yaml
```

### 5. Visualize quantitative results
```
python scripts/250527_readout_vision/plot_results.py path/to/the/config.yaml
```

### 6. Calculate perceptual metrics
```
python scripts/250527_readout_vision/calculate_perceptual_metrics.py path/to/the/config.yaml
```
