## Code for the ICML 2026 Submission "On the Sample Efficiency of Inverse Dynamics Models for Semi-Supervised Imitation Learning"

This repository is an anonymized fork of the original LAPO code ([“Learning to Act without Actions”](https://arxiv.org/abs/2312.10812)).

The main changes for the LAPO+ algorithm can be found in `lapo/train_lapo_plus_stage2.py` and `lapo/train_lapo_plus_stage3.py`.


## Setup instructions

We recommend using `python==3.11`. If you want to use an older Python version, you need to replace the `procgen-mirror` package in `requirements.txt` with `procgen`.
To install the dependencies, run:

```bash
pip install -r requirements.txt
```
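If you are on an older Python version, the package swap described above can be scripted instead of edited by hand. The following is a minimal sketch, not part of the repository: the helper name `make_legacy_requirements` and the output file name are hypothetical, and it assumes the package is listed at the start of a line in `requirements.txt`. Any version pin is carried over unchanged and may need adjusting for `procgen`.

```bash
# Hypothetical helper (not part of this repository).
# Writes a copy of a requirements file in which the `procgen-mirror`
# package name is replaced by `procgen`. The input file is left untouched.
# Usage: make_legacy_requirements <input-file> <output-file>
make_legacy_requirements() {
    sed 's/^procgen-mirror/procgen/' "$1" > "$2"
}
```

For example, `make_legacy_requirements requirements.txt requirements-legacy.txt` followed by `pip install -r requirements-legacy.txt` would install from the modified copy.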

### Dataset setup
A subsample of the `miner` dataset is included in this repository.


## LAPO+ (with reduced steps)

```bash
cd lapo
```

Stage 1.
```bash
WANDB_MODE=disabled python train_lapo_stage1.py exp_name=test env_name=miner lapo_stage1.steps=1000
```
Stage 2.
```bash
WANDB_MODE=disabled python train_lapo_plus_stage2.py env_name=miner exp_name=test lapo_plus_stage2.n_observed_samples=1024 lapo_plus_stage2.freeze_backbone=true lapo_plus_stage2.steps=1000
```
Stage 3. The key stage 2 parameters are passed again so that the correct stage 2 checkpoint is loaded.
```bash
WANDB_MODE=disabled python train_lapo_plus_stage3.py env_name=miner exp_name=test lapo_plus_stage2.n_observed_samples=1024 lapo_plus_stage2.freeze_backbone=true lapo_plus_stage3.steps=1000
```

Checkpoints are saved to `lapo/exp_results`.
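
The three stage commands above share `env_name`, `exp_name`, and the stage 2 parameters, and stage 3 only finds the right checkpoint if those values match. A minimal sketch of keeping them in one place is shown below. The variable names (`ENV_NAME`, `EXP_NAME`, `N_OBSERVED`, `STEPS`) and the `stage_cmd` helper are illustrative and not part of the repository; the script names and overrides are copied from the commands above. The helper only prints a command and does not run it.

```bash
# Shared settings (illustrative names; override them via the environment).
ENV_NAME=${ENV_NAME:-miner}
EXP_NAME=${EXP_NAME:-test}
N_OBSERVED=${N_OBSERVED:-1024}
STEPS=${STEPS:-1000}

# Print the training command for stage 1, 2, or 3.
# Stages 2 and 3 receive identical stage 2 overrides so that stage 3
# refers to the checkpoint produced by stage 2.
stage_cmd() {
    common="env_name=${ENV_NAME} exp_name=${EXP_NAME}"
    stage2="lapo_plus_stage2.n_observed_samples=${N_OBSERVED} lapo_plus_stage2.freeze_backbone=true"
    case "$1" in
        1) echo "python train_lapo_stage1.py ${common} lapo_stage1.steps=${STEPS}" ;;
        2) echo "python train_lapo_plus_stage2.py ${common} ${stage2} lapo_plus_stage2.steps=${STEPS}" ;;
        3) echo "python train_lapo_plus_stage3.py ${common} ${stage2} lapo_plus_stage3.steps=${STEPS}" ;;
        *) echo "unknown stage: $1" >&2; return 1 ;;
    esac
}
```

From inside the `lapo` directory, the stages could then be run in order with `export WANDB_MODE=disabled; for s in 1 2 3; do eval "$(stage_cmd "$s")" || break; done`, which stops at the first stage that fails.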