# SKETCH: Semantic Key-Point Conditioning for Long-Horizon Vessel Trajectory Prediction

This folder contains the supplementary code for our ICML 2026 submission:
**SKETCH: Semantic Key-Point Conditioning for Long-Horizon Vessel Trajectory Prediction**

We provide two components:
1. **SKETCH / Capacity Large Model** (MiniMind-based, NKP-conditioned, retrieval-augmented).
2. **MP-LSTM baseline re-implementation** used for comparison in the paper.

Each component is self-contained and has its own detailed `README.md`.  
This top-level README serves as a **navigation entry point**.



## Folder Structure

    SKETCH/
    ├── README.md                      # Overview + navigation
    ├── CapacityLargeModel/            # SKETCH / Capacity Large Model (MiniMind + NKP)
    │   ├── README.md                  # Detailed usage and notes
    │   ├── models/
    │   ├── utils/
    │   ├── weights_pretrain/
    │   ├── weights_sft/
    │   ├── enrolled_trajectory.npy
    │   └── evaluate_final_dataloader_Public.py
    │
    ├── MP-LSTM/                       # MP-LSTM baseline (re-implementation)
    │   ├── README.md                  # Detailed usage and notes
    │   ├── LSTM-Public/
    │   └── mp_lstm_models2/
    │
    └── data/                          # Public trajectory data (CSV files)

## What to run?

### A) Run / verify SKETCH (Capacity Large Model)

Please follow:
* `CapacityLargeModel/README.md`

Typical entry point:
```bash
python CapacityLargeModel/evaluate_final_dataloader_Public.py
```
Notes:
* Pretrained and SFT checkpoints are provided under `weights_pretrain/` and `weights_sft/`.
* The enrolled trajectory memory is provided as `enrolled_trajectory.npy`.
* This release includes only the code required for evaluation; the full training and inference scripts will be released after acceptance.
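
Before launching the evaluation, it can help to confirm that the resources listed above are in place. The following is a minimal sketch, not part of the released code: the helper name `missing_resources` and the list `EXPECTED_RESOURCES` are hypothetical, and the paths are taken from the folder structure shown in this README.

```python
from pathlib import Path
from typing import List

# Entries expected inside CapacityLargeModel/, per the folder structure above.
EXPECTED_RESOURCES = [
    "weights_pretrain",
    "weights_sft",
    "enrolled_trajectory.npy",
    "evaluate_final_dataloader_Public.py",
]


def missing_resources(root: str = "CapacityLargeModel") -> List[str]:
    """Return the expected entries that do not exist under `root` (hypothetical helper)."""
    base = Path(root)
    return [name for name in EXPECTED_RESOURCES if not (base / name).exists()]
```

Calling `missing_resources()` from the repository root returns an empty list when everything is present, and otherwise the names that still need to be restored or downloaded.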



### B) MP-LSTM baseline (re-implementation)

Please follow:
* `MP-LSTM/README.md`

This baseline includes the MP-LSTM pipeline (trajectory matching + support point prediction + spline interpolation) and evaluation utilities.
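
To give a rough idea of the final pipeline stage, the sketch below interpolates a smooth path through a few 2D support points. It uses a uniform Catmull-Rom spline purely as an illustrative assumption; the spline type, the function name `catmull_rom_path`, and the sample count are hypothetical and may differ from what the baseline code in `MP-LSTM/` actually does.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def catmull_rom_path(points: List[Point], samples_per_segment: int = 10) -> List[Point]:
    """Interpolate a smooth path through support points (uniform Catmull-Rom).

    Illustrative only: not the interpolation routine shipped with the baseline.
    """
    if len(points) < 2:
        return list(points)
    # Duplicate the endpoints so the curve starts and ends on the given points.
    padded = [points[0]] + list(points) + [points[-1]]
    path: List[Point] = []
    for i in range(1, len(padded) - 2):
        p0, p1, p2, p3 = padded[i - 1], padded[i], padded[i + 1], padded[i + 2]
        for s in range(samples_per_segment):
            t = s / samples_per_segment
            t2, t3 = t * t, t * t * t
            # Standard Catmull-Rom basis, applied per coordinate.
            path.append(tuple(
                0.5 * (2 * b + (c - a) * t
                       + (2 * a - 5 * b + 4 * c - d) * t2
                       + (-a + 3 * b - 3 * c + d) * t3)
                for a, b, c, d in zip(p0, p1, p2, p3)
            ))
    path.append(points[-1])
    return path
```

The resulting path passes through every support point, which is the property that makes an interpolating spline suitable for connecting predicted support points into a full trajectory.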



## Large / Intermediate Files (MP-LSTM)

Some MP-LSTM intermediate artifacts required for inference (e.g., `trajectory_database_288.pkl`, `test_trajectories.pkl`) are **not included** due to file size constraints.
They can be generated by the MP-LSTM preprocessing scripts as documented in:
* `MP-LSTM/README.md`
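
Once generated, these `.pkl` files can be read with Python's standard `pickle` module. The snippet below is a minimal sketch under the assumption that the artifacts are ordinary pickle files; the helper name `load_artifact` is hypothetical, and the file location is left to the caller because it depends on where the preprocessing scripts write their output.

```python
import pickle
from pathlib import Path
from typing import Any


def load_artifact(path: str) -> Any:
    """Load a pickled artifact, with a hint when it has not been generated yet."""
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(
            f"{p} not found; generate it with the preprocessing scripts "
            "described in MP-LSTM/README.md"
        )
    # Only unpickle files you generated yourself: pickle can execute arbitrary code.
    with p.open("rb") as f:
        return pickle.load(f)
```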



## Code Availability Notes (Capacity Large Model)

For the Capacity Large Model, standalone training and standalone inference scripts are **not included** in this release.
The provided code and resources are sufficient to verify and analyze the results presented in the paper. They include:
* full model definitions (MiniMind variants),
* released checkpoints,
* enrolled trajectory memory,
* evaluation script and qualitative examples.
