# Evaluation Toolkit

This directory contains the scripts for scoring generated images and checking result sets.

## Core Files
- `eval.py`: End-to-end metrics runner. Calculates CLIP, ImageReward, PSNR, SSIM, LPIPS, and other metrics, and compares outputs against a reference folder.
- `imagereward.py`: Lightweight driver that scores prompt-image pairs with ImageReward only.
- `clip.py`: Standalone CLIP score calculator that reads prompts from image EXIF metadata.
- `psnrssim.py`: Folder comparison helper that reports PSNR, SSIM, LPIPS, and other prompt-independent metrics.
- `collect_results.py`: Automated wrapper that scans result folders, calls `eval.py`, and aggregates CSV/JSON/Markdown reports.
- `eval.sh`: Shell helper that samples image subsets and runs `imagereward.py`; accepts a `PYTHON_BIN` override.
- `compare_images.sh`: Launches a static HTML dashboard that visualizes Taylor vs HiCache outputs under `results/`.
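
For readers unfamiliar with the prompt-independent metrics listed above, here is a minimal sketch of the standard PSNR formula, `10 * log10(MAX^2 / MSE)`. It is an illustration only: the function name `psnr` and its flat-sequence input are assumptions made for this example, and the scripts in this folder may compute the metric differently (for instance on image arrays via a library).

```python
import math
from typing import Sequence


def psnr(reference: Sequence[int], generated: Sequence[int], max_value: int = 255) -> float:
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Hypothetical helper for illustration; not taken from psnrssim.py.
    """
    if len(reference) != len(generated):
        raise ValueError("inputs must have the same number of pixel values")
    if not reference:
        raise ValueError("inputs must not be empty")

    # Mean squared error over all pixel values.
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)

    # Identical inputs have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values mean the generated image is closer to the reference; identical inputs yield infinity.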

## Usage Instructions
- Before running the Python scripts, create and activate the repository virtual environment (`python3.10 -m venv .venv && source .venv/bin/activate`).
- For GPU-dependent tasks (ImageReward, LPIPS), ensure CUDA is visible and the required checkpoints are located under `/root/autodl-tmp`, the default location the scripts reference.
- Batch processing tools (`collect_results.py`, experiment scanning scripts) look for generated images under `results/`; if your outputs are located elsewhere, adjust the relevant flags.
- When calling the shell wrappers from outside this folder, use an absolute or repository-rooted path (such as `bash evaluation/eval.sh --help`) so that the scripts can correctly locate their sibling modules.
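
The checks above can be sketched as a small pre-flight script. This is a hedged illustration, not part of the toolkit: the function name `preflight` and its parameters are hypothetical, the default paths are the ones mentioned in this README, and the CUDA check assumes PyTorch is the GPU backend.

```python
import os
from pathlib import Path
from typing import List


def preflight(checkpoint_dir: str = "/root/autodl-tmp",
              results_dir: str = "results",
              check_cuda: bool = True) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed.

    Hypothetical helper for illustration only.
    """
    problems: List[str] = []

    # Checkpoints for ImageReward / LPIPS are expected under checkpoint_dir.
    if not Path(checkpoint_dir).is_dir():
        problems.append(f"checkpoint directory not found: {checkpoint_dir}")

    # Batch tools look for generated images under results_dir.
    if not Path(results_dir).is_dir():
        problems.append(f"results directory not found: {results_dir}")

    if check_cuda:
        # An empty CUDA_VISIBLE_DEVICES hides every GPU from the process.
        if os.environ.get("CUDA_VISIBLE_DEVICES") == "":
            problems.append("CUDA_VISIBLE_DEVICES is empty, so no GPU is visible")
        try:
            import torch  # assumption: PyTorch is installed in the venv
            if not torch.cuda.is_available():
                problems.append("torch.cuda.is_available() returned False")
        except ImportError:
            problems.append("PyTorch is not importable; cannot verify CUDA")

    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("WARNING:", problem)
```

Running it from the repository root before a long evaluation surfaces a missing checkpoint folder or hidden GPU early rather than partway through a batch.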
