# AnIsoNet Inference Package

ICML 2026 Submission - Inference Code Only

## Quick Start

### S3DIS Area 5

```bash
python tools/test_s3dis_simple.py \
    --checkpoint checkpoints/s3dis_area5_best.pth \
    --data_root /path/to/S3DIS/Area_5 \
    --output results/s3dis
```

Expected mIoU: 82.62%

**Note**: The S3DIS data should contain pre-extracted 256-dim features from the LAGM encoder. If you use raw RGB colors instead, the checkpoint must contain the complete model (encoder + decoder).
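
To check whether a checkpoint holds both an encoder and a decoder, one option is to group its parameter names by top-level prefix. The sketch below is only an illustration: the helper `count_top_level_prefixes` and the key names `encoder.*` / `decoder.*` are hypothetical, and the real checkpoints may use different names or nest the weights under another key.

```python
from collections import Counter


def count_top_level_prefixes(keys):
    """Count parameter names per top-level prefix (the text before the first dot)."""
    return Counter(key.split(".", 1)[0] for key in keys)


# Hypothetical key names for illustration only; real checkpoints may differ.
example_keys = [
    "encoder.layer0.weight",
    "encoder.layer0.bias",
    "decoder.mlp.weight",
    "decoder.classifier.weight",
]

prefix_counts = count_top_level_prefixes(example_keys)
has_encoder = "encoder" in prefix_counts
print(dict(prefix_counts), "encoder present:", has_encoder)
```

With a real file, the keys would come from the loaded checkpoint, for example from the dictionary returned by `torch.load(path, map_location="cpu")`, assuming the checkpoint is stored as a plain state dict.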

### ScanObjectNN

```bash
python tools/test_scanobjectnn_simple.py \
    --checkpoint checkpoints/scanobjectnn_morton_best.pth \
    --data_root /path/to/ScanObjectNN/h5_files \
    --output results/scanobjectnn \
    --batch_size 32
```

Expected Accuracy: 94.21%

**Note**: ScanObjectNN requires the complete model (encoder + decoder) in the checkpoint, as it processes raw xyz coordinates.

## Dataset Download

### S3DIS
- **Official Link**: http://buildingparser.stanford.edu/dataset.html
- **Version**: Stanford3dDataset_v1.2_Aligned_Version
- **Size**: ~6GB
- **For Inference**: Use Area_5 directory

### ScanObjectNN
- **Official Link**: https://hkust-vgd.github.io/scanobjectnn/
- **Direct Download**:
  ```bash
  wget --no-check-certificate https://hkust-vgd.ust.hk/scanobjectnn/h5_files.zip
  unzip h5_files.zip
  ```
- **Size**: ~600MB
- **For Inference**: Use h5_files directory

## Requirements

```
torch>=2.0.0
numpy>=1.24.0
h5py>=3.8.0
tqdm
```

## Core Components

- **GISA Decoder** (`models/gisa_decoder.py`): Global Isotropy Semantic Aggregation
  - Heterogeneous bi-directional scanning: Morton + Identity
  - O(N) linear complexity via GatedDeltaNetBlock

- **Space-filling curves** (`models/scan_utils.py`): Morton/Hilbert encoding

- **Datasets**: S3DIS + ScanObjectNN loaders with preprocessing
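
For readers unfamiliar with Morton encoding, the following is a minimal, generic sketch of how points can be serialized along a Z-order curve. It is not taken from `models/scan_utils.py`; the function names `quantize` and `morton3d` and the 10-bit grid resolution are assumptions made for illustration.

```python
def quantize(points, bits=10):
    """Map float (x, y, z) points onto an integer grid with 2**bits cells per axis."""
    lo = [min(p[d] for p in points) for d in range(3)]
    hi = [max(p[d] for p in points) for d in range(3)]
    scale = (1 << bits) - 1
    cells = []
    for p in points:
        cell = []
        for d in range(3):
            extent = hi[d] - lo[d]
            t = (p[d] - lo[d]) / extent if extent > 0 else 0.0
            cell.append(int(t * scale))
        cells.append(tuple(cell))
    return cells


def morton3d(ix, iy, iz, bits=10):
    """Interleave the bits of three grid indices into a single Morton code."""
    code = 0
    for b in range(bits):
        code |= ((ix >> b) & 1) << (3 * b)
        code |= ((iy >> b) & 1) << (3 * b + 1)
        code |= ((iz >> b) & 1) << (3 * b + 2)
    return code


points = [(0.9, 0.9, 0.9), (0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.9, 0.0)]
codes = [morton3d(*cell) for cell in quantize(points)]
# Indices of the points sorted by Morton code, i.e. the scan order.
morton_order = sorted(range(len(points)), key=lambda i: codes[i])
print(morton_order)
```

Sorting by the interleaved code tends to keep spatially nearby points close together in the resulting sequence, which is the property a Morton-ordered scan relies on.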

## Architecture

```
Input (N, 256) → MLP → DeltaNet (Morton + Identity) → Gated Fusion → Classifier → (N, C)
```

- **Morton track**: Captures geometric continuity (anisotropic)
- **Identity track**: Preserves semantic structure (isotropic)
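
The two-track data flow can be sketched roughly as follows. This is a toy illustration under strong assumptions, not the GISA decoder: features are scalars rather than 256-dim vectors, a running mean (`running_mean`) stands in for the DeltaNet block, and a fixed scalar gate (`gate_logit`) stands in for whatever learned gating the model uses. All names here are hypothetical.

```python
import math


def inverse_permutation(order):
    """Return inv such that inv[order[rank]] == rank."""
    inv = [0] * len(order)
    for rank, idx in enumerate(order):
        inv[idx] = rank
    return inv


def running_mean(seq):
    """Placeholder O(N) causal scan; a stand-in for the real sequence block."""
    out, total = [], 0.0
    for count, value in enumerate(seq, start=1):
        total += value
        out.append(total / count)
    return out


def two_track(features, morton_order, gate_logit=0.0):
    # Identity track: scan the points in their stored order.
    identity_out = running_mean(features)
    # Morton track: reorder, scan, then scatter back to the stored order.
    scanned = running_mean([features[i] for i in morton_order])
    inv = inverse_permutation(morton_order)
    morton_out = [scanned[inv[i]] for i in range(len(features))]
    # Gated fusion: blend the two tracks with a sigmoid gate.
    gate = 1.0 / (1.0 + math.exp(-gate_logit))
    return [gate * m + (1.0 - gate) * s for m, s in zip(morton_out, identity_out)]


fused = two_track([4.0, 0.0, 2.0], morton_order=[1, 2, 0])
print(fused)
```

The point of the sketch is the bookkeeping: the Morton track needs an inverse permutation so that its outputs line up with the identity track point by point before the two are fused.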

## Code Statistics

- Total code: 1104 lines
- GISA Decoder: 277 lines
- Data utilities: 256 lines
- Test scripts: 461 lines

## Submission Notes

This package contains the inference implementation to support reproducibility verification during peer review. We've included the GISA decoder code, which embodies our key contribution: the heterogeneous bi-directional scanning mechanism that decouples spatial serialization from semantic aggregation. The provided checkpoints contain complete trained models, so reviewers can directly validate our reported results without needing the training code.

Following standard academic practice, we'll release the full training pipeline and LAGM encoder implementation once the paper is accepted. This lets us respond to reviewer feedback and finalize the codebase properly before public release. The decoder code here is self-contained and sufficient to understand our method's core innovation.

For questions during review, please contact us through the ICML submission system.
