# PMER + MBPO Reproduction
This repo is adapted from:

>1. Unstable Baselines (link removed for anonymous review)

>2. When to Trust Your Model: Model-Based Policy Optimization (MBPO) (link removed for anonymous review)

The only change is the added PMER module; the hyper-parameters and structure of MBPO are left unmodified.


## Instructions
Run all commands from the project root, `YOUR_PATH/PMER_repo`, with the `PMER` environment activated.

For a quick start, test PMER+MBPO on `InvertedPendulum-v2`:

>PMER + MBPO
```bash
python3 ./unstable_baselines/model_based_rl/mbpo/main_prior.py \
        unstable_baselines/model_based_rl/mbpo/configs/InvertedPendulum-v2.py \
        --prior_ratio 0.05 \
        --gpu 0
```

>MBPO (`--prior_ratio 0.`)
```bash
python3 ./unstable_baselines/model_based_rl/mbpo/main_prior.py \
        unstable_baselines/model_based_rl/mbpo/configs/InvertedPendulum-v2.py \
        --prior_ratio 0. \
        --gpu 0
```

Results are stored in `logs/mbpo/InvertedPendulum-v2`. This takes around 5 minutes.
Other tasks follow the same pattern:

>Hopper task: PMER + MBPO (`--prior_ratio 0.3`), then MBPO (`--prior_ratio 0.`)

```bash
python3 unstable_baselines/model_based_rl/mbpo/main_prior.py \
        unstable_baselines/model_based_rl/mbpo/configs/Hopper-v3.py \
        --prior_ratio 0.3 \
        --gpu 0
```

```bash
python3 unstable_baselines/model_based_rl/mbpo/main_prior.py \
        unstable_baselines/model_based_rl/mbpo/configs/Hopper-v3.py \
        --prior_ratio 0. \
        --gpu 0
```
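
The paired commands above differ only in the config file and the `--prior_ratio` value, so they can be wrapped in a small helper. The script below is a minimal sketch and not part of the repo: `build_cmd`, `run_pair`, and the `DRY_RUN` variable are hypothetical names introduced here, while the script path, config directory, and flags are copied from the commands above. By default it only prints the commands; it assumes it is run from the project root.

```bash
#!/usr/bin/env bash
# Sketch: run the PMER + MBPO and MBPO commands for one task back to back.
# build_cmd, run_pair, and DRY_RUN are illustrative names, not part of the repo.
set -euo pipefail

MAIN="unstable_baselines/model_based_rl/mbpo/main_prior.py"
CONFIG_DIR="unstable_baselines/model_based_rl/mbpo/configs"

# Print the command line for one run: task name, prior ratio, GPU index.
build_cmd() {
    local task="$1" ratio="$2" gpu="$3"
    echo "python3 ${MAIN} ${CONFIG_DIR}/${task}.py --prior_ratio ${ratio} --gpu ${gpu}"
}

# Run (or, by default, only print) the PMER + MBPO run followed by the
# MBPO run with --prior_ratio 0. on the same task.
run_pair() {
    local task="$1" pmer_ratio="$2" gpu="${3:-0}"
    local ratio cmd
    for ratio in "${pmer_ratio}" "0."; do
        cmd="$(build_cmd "${task}" "${ratio}" "${gpu}")"
        echo "${cmd}"
        # DRY_RUN defaults to 1 (print only); set DRY_RUN=0 to execute.
        if [ "${DRY_RUN:-1}" = "0" ]; then
            ${cmd}
        fi
    done
}

run_pair Hopper-v3 0.3 0
```

Saved under any file name, the script prints the two Hopper commands as written; running it with `DRY_RUN=0` set in the environment executes them in sequence.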

