# Evaluation Scripts

This directory houses the quantitative evaluation pipeline for assessing dense monocular depth maps against metric ground-truth fields. 

As in `visualization_scripts`, every script here enforces strict **Evaluation Parity**. Each calculation clips anomalous sensor readings, aligns model predictions to physical (metric) scale, and reports error scores per distance bin, with the bins chosen to match the ranges that matter for navigation.

## Core Evaluation Mechanics

### Multi-Protocol Alignments
Models that output unscaled relative depth (e.g. DepthAnything-style models) produce maps with no metric scale. Before any error is computed, these scripts fit a **least-squares scale and shift** directly against the ground truth, which maps the raw predictions into metric units.
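
The fit described above can be sketched as an ordinary least-squares problem over the valid pixels. This is a minimal illustration, not the scripts' actual code: the function name `align_scale_shift` and its arguments are assumptions, and the sketch fits in whatever space the arrays are passed in (depth or inverse depth), since that choice is not specified here.

```python
import numpy as np

def align_scale_shift(pred, gt, mask):
    """Fit s, t minimizing ||s * pred + t - gt||^2 over pixels where mask is True.

    Hypothetical helper: name and signature are illustrative only.
    """
    x = pred[mask].astype(np.float64)
    y = gt[mask].astype(np.float64)
    # Design matrix [x, 1] for the linear model y ~ s * x + t
    A = np.stack([x, np.ones_like(x)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)
    # Apply the fitted scale and shift to the full prediction map
    return s * pred + t, float(s), float(t)
```

The fit uses only masked pixels, but the returned map covers the whole image, so the same mask can be reused for the error computation that follows.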

### Error Computations
The metrics are the standard regression scores used in single-image depth estimation. Results are written to `json` outputs:
- **Classic**: RMSE, absolute relative error (AbsRel), `δ` thresholds (`δ<1.25^n`), scale-invariant logarithmic errors.
- **Distance Bins**: Results are split into near/medium/far ranges (e.g., `0-5m`, `5-15m`, `>15m`).
- **Shaded Extrapolations (Dark Masks)**: Some evaluations exclude shadowed regions. Dark masks are built from the RGB source using morphological operations with an elliptical structuring element (`cv2.MORPH_ELLIPSE`), so that shadowed pixels do not contribute to the final error scores.
- **Semantic Limits** (LuSNAR ONLY): Errors are reported separately for the rock, crater, and regolith classes.
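
As a rough sketch of the classic metrics and the distance bins listed above, assuming predictions are already aligned to metres and both arrays are strictly positive where valid. The function names, the dictionary keys, the half-open bin edges, and the particular scale-invariant log formula are assumptions for illustration, not the scripts' confirmed definitions.

```python
import numpy as np

# Example bin edges in metres, taken from the ranges quoted above.
BINS = {"0-5m": (0.0, 5.0), "5-15m": (5.0, 15.0), ">15m": (15.0, np.inf)}

def depth_metrics(pred, gt):
    """Classic metrics over 1-D arrays of valid, positive depths (metres)."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    ratio = np.maximum(pred / gt, gt / pred)
    log_diff = np.log(pred) - np.log(gt)
    # One common scale-invariant log form (variance of the log difference);
    # clamped at zero to absorb floating-point round-off.
    si_var = max(np.mean(log_diff ** 2) - np.mean(log_diff) ** 2, 0.0)
    return {
        "rmse": float(np.sqrt(np.mean((pred - gt) ** 2))),
        "abs_rel": float(np.mean(np.abs(pred - gt) / gt)),
        "delta1": float(np.mean(ratio < 1.25)),
        "delta2": float(np.mean(ratio < 1.25 ** 2)),
        "delta3": float(np.mean(ratio < 1.25 ** 3)),
        "silog": float(np.sqrt(si_var)),
    }

def binned_metrics(pred, gt, valid):
    """Compute the same metrics per ground-truth distance bin."""
    out = {}
    for name, (lo, hi) in BINS.items():
        # Assumed convention: lower edge inclusive, upper edge exclusive
        sel = valid & (gt >= lo) & (gt < hi)
        out[name] = depth_metrics(pred[sel], gt[sel]) if sel.any() else None
    return out
```

Empty bins are reported as `None` rather than `NaN` so the result serializes cleanly with `json.dumps`.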

## Core Scripts

Each script targets one benchmark and applies that benchmark's depth limits:

### 1. `eval_change.py`
Evaluates Chang'e rover mappings (MAX 25.0m).

### 2. `eval_cheri.py`
Evaluates Cheri mappings, with a tight cap suited to its narrow environments (MAX 17.0m).

### 3. `eval_etna.py`
Evaluates volcanic-terrain data against drone projections (MAX 15.0m).

### 4. `eval_lunarsim.py`
Evaluates against LunarSim synthetic ground truth stored as PNG depth maps. No maximum depth cutoff is applied.

### 5. `eval_lusnar.py`
Evaluates against high-fidelity PFM ground truth. Handles prediction/ground-truth pairing across a large number of sequence subfolders, and reports `semantic` metrics along with extended distance bins (MAX 50.0m).

### 6. `eval_s3li.py`
Tests terrestrial analog sequences such as `crater_inout` against composite predictions, accounting for metric variations (MAX 30.0m).
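
To make the per-benchmark caps concrete, the sketch below collects the `MAX` values listed above and applies one as a validity mask. The table name `MAX_DEPTH_M`, the helper `valid_mask`, the inclusive upper bound, and the choice to exclude (rather than clamp) out-of-range pixels are all assumptions for illustration.

```python
import numpy as np

# Caps in metres as listed per script; None means no maximum cutoff.
MAX_DEPTH_M = {
    "eval_change.py": 25.0,
    "eval_cheri.py": 17.0,
    "eval_etna.py": 15.0,
    "eval_lunarsim.py": None,
    "eval_lusnar.py": 50.0,
    "eval_s3li.py": 30.0,
}

def valid_mask(gt, max_depth, min_depth=0.0):
    """Boolean mask of ground-truth pixels that are finite and within range.

    Hypothetical helper: min_depth and the inclusive cap are assumed conventions.
    """
    gt = np.asarray(gt, dtype=np.float64)
    mask = np.isfinite(gt) & (gt > min_depth)
    if max_depth is not None:
        mask &= gt <= max_depth
    return mask
```

For example, `valid_mask(gt, MAX_DEPTH_M["eval_etna.py"])` would keep only finite, positive ground-truth pixels at or below 15.0 m.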

## Example Execution

All scripts distribute work across a pool of spawned worker processes. This parallelizes I/O-heavy loading and avoids contention on the Python `GIL`:

```bash
python eval_s3li.py \
    --pred-dir /path/to/inferences/s3li \
    --output-dir /path/to/results/s3li \
    --take-inverse \
    --workers 16
```
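
The worker-pool pattern behind `--workers` might look roughly like the following. This is only a sketch of a spawn-based `multiprocessing` pool: `evaluate_one`, `evaluate_all`, and the shape of the per-frame result are hypothetical, and the real scripts may organize the work differently.

```python
import multiprocessing as mp
from pathlib import Path

def evaluate_one(pair):
    """Hypothetical per-frame job for a (prediction path, ground-truth path) pair.

    A real job would load both files and compute metrics; this placeholder
    only records which frame was processed.
    """
    pred_path, gt_path = pair
    return {"frame": Path(pred_path).stem, "gt": str(gt_path)}

def evaluate_all(pairs, workers):
    """Run evaluate_one over all pairs in separate processes.

    Each spawned process has its own interpreter, so CPU-bound work in one
    worker is not serialized by another worker's GIL.
    """
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=workers) as pool:
        # imap yields results in input order
        return list(pool.imap(evaluate_one, pairs, chunksize=8))

# With the "spawn" start method, call evaluate_all(...) only from under an
# `if __name__ == "__main__":` guard in the entry-point script.
```

The worker function is defined at module level because spawned processes must be able to import it by name.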

## Standalone Virtual Environment

This module is decoupled from the rest of the project and is designed to run in its own isolated environment.
To avoid conflicts with other system dependencies, create a dedicated virtual environment for this directory.

```bash
# 1. Create a virtual environment in this directory
python3 -m venv venv

# 2. Activate it
source venv/bin/activate

# 3. Install the required packages
pip install --upgrade pip
pip install numpy opencv-python
```
