# README

This repository provides instructions for setting up the environment, tokenizing training and test data, and evaluating model checkpoints.

## Data and Model Availability

Due to the anonymous review policy of ICLR 2026, we are unable to release the full datasets and model checkpoints at this time. **All data and models will be made publicly available after the review period concludes.**

To facilitate reproducibility and allow reviewers to test the pipeline, we include several **dummy samples** under the `dataset/` directory. These samples are compatible with `tokenize_data.py` and can be used to verify that tokenization and evaluation run end to end. The code is structured so that the full data and models can be integrated once they are released.

## Environment Setup

To set up the environment, run the following:

```
conda create -n your_env python=3.12.3
conda activate your_env
chmod +x ./scripts/install.sh
./scripts/install.sh
```
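As a quick sanity check after activating the environment, the interpreter version can be compared against the one pinned in the `conda create` command above. This is a minimal sketch, not part of the repository; the helper name `python_matches` is hypothetical, and it only checks the major and minor version.

```python
import sys

# Major/minor version pinned by the `conda create` command above.
EXPECTED = (3, 12)


def python_matches(version_info=sys.version_info, expected=EXPECTED):
    """Return True if the major/minor version equals the expected pair."""
    return tuple(version_info[:2]) == tuple(expected)


# Report the active interpreter and whether it matches the pinned series.
status = "OK" if python_matches() else "differs from the pinned 3.12.x"
print("Python", ".".join(map(str, sys.version_info[:3])), status)
```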

## Tokenization

We provide default configurations for generating training and evaluation data.

- To generate data for VPFT and VPRL Stage 2 (optimal trajectories), run:

```
python tokenize_data.py +dataset=frozen_lake_optimal
```

- To generate data for VPRL Stage 1 (random exploration), run:

```
python tokenize_data.py +dataset=frozen_lake_random
```
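If both datasets are needed, the two commands above can be driven from a small wrapper. The sketch below is illustrative only: the helper names `tokenize_command` and `tokenize_all` are hypothetical, and it assumes the script is run from the repository root where `tokenize_data.py` lives. By default it only prints the commands.

```python
import shlex
import subprocess
import sys

# Dataset config names taken from the commands above.
DATASETS = ["frozen_lake_optimal", "frozen_lake_random"]


def tokenize_command(dataset, python=sys.executable):
    """Build the argv list for one tokenization run."""
    return [python, "tokenize_data.py", f"+dataset={dataset}"]


def tokenize_all(datasets=DATASETS, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for name in datasets:
        cmd = tokenize_command(name)
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failure


tokenize_all()  # dry run: prints the two commands without executing them
```

Calling `tokenize_all(dry_run=False)` would execute the runs in sequence and raise on the first non-zero exit code.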

## Evaluation

After training, evaluate models with the `eval.py` script. It supports two modes: a base model with a separate LoRA adapter, or a merged model.

### Evaluate a base model with a LoRA adapter

```
python eval.py \
  model_path=your_base_model_path_before_merge \
  lora_path=your_lora_path \
  evaluation_result_folder_pth=folder_path_to_store_results
```

### Evaluate a merged model

```
python eval.py \
  model_path=merged_model_path \
  evaluation_result_folder_pth=folder_path_to_store_results
```
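The two modes above differ only in whether `lora_path` is passed. A minimal sketch of that choice, assuming a hypothetical helper `eval_command` that is not part of the repository and using only the three overrides shown above:

```python
import shlex


def eval_command(model_path, result_folder, lora_path=None):
    """Build the eval.py argv list; include lora_path only when given."""
    args = ["python", "eval.py", f"model_path={model_path}"]
    if lora_path is not None:
        # Base model plus adapter: pass the adapter path as an override.
        args.append(f"lora_path={lora_path}")
    args.append(f"evaluation_result_folder_pth={result_folder}")
    return args


# Placeholder paths, mirroring the commands above.
print(shlex.join(eval_command("your_base_model_path_before_merge",
                              "folder_path_to_store_results",
                              lora_path="your_lora_path")))
print(shlex.join(eval_command("merged_model_path",
                              "folder_path_to_store_results")))
```

Keeping the command as a list rather than a single string avoids quoting problems if a path contains spaces.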

## Display Evaluation Statistics

To print summary statistics after evaluation:

```
python eval.py \
  evaluation_result_folder_pth=folder_path_to_store_results \
  show_stats=True
```

Alternatively, omit the folder argument to use the default configuration:

```
python eval.py show_stats=True
```