# Latent Policy Barrier

This repo is a working copy of the code for NeurIPS submission #11411.

## Installation
Create the conda environment:
```console
$ mamba env create -f conda_environment.yaml
```

## Reproducing Simulation Benchmark Results 
### Download Expert Demonstration Data
We provide expert demonstration data for four simulation tasks used in our benchmark. You can download them from the links below:

- **Push-T**: [Download from Google Drive](https://drive.google.com/drive/folders/1BmADkgF5OtvFekrP6d_LIGbetL86Nakd?usp=sharing)
- **Square**: [Download from Google Drive](https://drive.google.com/file/d/1xOzVlNHD-uH17Nv1aUgb-T2dbV9zZbtc/view?usp=sharing)
- **Tool-Hang**: [Download from Google Drive](https://drive.google.com/file/d/1KAb8PhuyjbUkqYEWT57dVjdYsVllBLAn/view?usp=sharing)
- **Transport**: [Download from Google Drive](https://drive.google.com/file/d/1BumHl9AQ7gJ9q8BwL3sAeWQSBj0yx3Ip/view?usp=sharing)

Each archive contains observations and actions in a structured format compatible with our training and inference pipeline.

### Running Inference with Action Optimization

We provide policy and dynamics model checkpoints for reproducing the test-time optimization experiments. You can download the checkpoints for each task from the links below:

- **Push-T**
  - [Policy Checkpoint](https://drive.google.com/file/d/1KE20U1JaiWMKemqcpGTpUaoKGOvinNkC/view?usp=sharing)
  - [Dynamics Model Checkpoint](https://drive.google.com/file/d/1lnfWRq7L1ohDpTLXjqiqg8Dg998hVFhb/view?usp=sharing)

- **Square**
  - [Policy Checkpoint](https://drive.google.com/file/d/1OM1hgSw68G2o2OFjcqdBiHXSXsnIYOMm/view?usp=sharing)
  - [Dynamics Model Checkpoint](https://drive.google.com/file/d/14rgMBsWxywEAKQw9wvdhBIKWDHRnBnAY/view?usp=sharing)

- **Tool-Hang**
  - [Policy Checkpoint](https://drive.google.com/file/d/1fXclNZRiOC3ow9ONlGdhFtEUDO1nUgIi/view?usp=sharing)
  - [Dynamics Model Checkpoint](https://drive.google.com/file/d/1gFe95AEPWlssxCgQ3JMDLh0At-PMr1Vl/view?usp=sharing)

- **Transport**
  - [Policy Checkpoint](https://drive.google.com/file/d/1m0GrqhoPPTxV4LbrbzH89B_InUNm1fD7/view?usp=sharing)
  - [Dynamics Model Checkpoint](https://drive.google.com/file/d/15ihxt2csfbTPyJTV8q9-lGewTEu-16lJ/view?usp=sharing)


After downloading the expert demonstration data, policy checkpoints, and dynamics model checkpoints:

- Set the path to the **expert demonstration data** in the `get_demo_latents()` function inside [`dino_wm/planner.py`](dino_wm/planner.py).
- Set the paths to the **policy** and **dynamics model checkpoints** in `eval_config.yaml`.
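
Before launching inference, it can help to confirm that the three downloaded artifacts are where you pointed the config. The snippet below is a minimal sketch of such a pre-flight check and is not part of this repo: `missing_paths` is a hypothetical helper, and the file names in `required` are placeholders that you should replace with your own locations.

```python
from pathlib import Path


def missing_paths(paths):
    """Return the labels of all entries whose path does not exist on disk."""
    return [
        label
        for label, path in paths.items()
        if not Path(path).expanduser().exists()
    ]


# Placeholder locations (assumptions): substitute the paths you set in
# dino_wm/planner.py and eval_config.yaml.
required = {
    "expert demonstrations": "data/square/demos.hdf5",
    "policy checkpoint": "checkpoints/square/policy.ckpt",
    "dynamics model checkpoint": "checkpoints/square/dynamics.ckpt",
}

missing = missing_paths(required)
if missing:
    print("Missing before inference can run: " + ", ".join(missing))
else:
    print("All required files found.")
```

The check only verifies that each path exists; it does not validate the contents of the files.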

Then, run inference with:

```console
(lpb)[lpb_submission_code]$ python eval_robomimic_hydra.py
```

### Training a Base Diffusion Policy
Activate the conda environment and log in to [wandb](https://wandb.ai).

After setting the path to the expert demonstrations, you can start the base policy training. For instance, to train a base policy for the Square task, run:
```console
(lpb)[lpb_submission_code]$ python train.py --config-dir=. --config-name=image_square_diffusion_policy_cnn.yaml training.seed=42 training.device=cuda:0 hydra.run.dir='data/outputs/${now:%Y.%m.%d}/${now:%H.%M.%S}_${name}_${task_name}'
```
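
The `hydra.run.dir` override in the command above uses Hydra's `now` resolver, which formats the current time with `strftime`, together with the `name` and `task_name` values from the config. The sketch below mimics that pattern in plain Python to show what the output directory looks like; `render_run_dir` is a hypothetical helper, and the `name` and `task_name` values passed to it are made-up examples rather than the values in this repo's config.

```python
from datetime import datetime


def render_run_dir(now, name, task_name):
    """Mimic 'data/outputs/${now:%Y.%m.%d}/${now:%H.%M.%S}_${name}_${task_name}'."""
    # datetime supports strftime codes directly inside an f-string format spec.
    return f"data/outputs/{now:%Y.%m.%d}/{now:%H.%M.%S}_{name}_{task_name}"


# Example values only (assumptions); the real ones come from the YAML config.
print(render_run_dir(datetime.now(), "train_diffusion", "square_image"))
```

The single quotes around the override in the shell command keep the shell from expanding `${...}` itself, so Hydra receives the pattern unchanged.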

### Training a Dynamics Model
`train.py` lets you save policy checkpoints at desired intervals. You can roll out the saved checkpoints to generate an additional rollout dataset. After aggregating it with the expert demonstrations, you can train a dynamics model on the combined dataset. After setting the paths to the combined dataset and the policy checkpoint in `dino_wm/conf/train.yaml`, start the dynamics model training:
```console
(lpb)[lpb_submission_code]$ python dino_wm/train.py --config-name train.yaml frameskip=15 num_hist=1 num_pred=1 action_emb_dim=240
```