# Revisiting Design Choices in Offline Model-Based Reinforcement Learning

The steps below run the Bayesian optimization (BO) loop on a D4RL environment, using the hopper mixed environment as an example.

First, train and save three models on the offline dataset; saved models are not shipped with the code because of their large size. Training is most easily done by running:
`python train.py --yaml_file args_yml/hopper-v2-d4rl-mopo.yml --seed 0 --uuid hopper_mixed_models`
and retrieving the output from `data/`. Repeat with a different `--seed` value for each model. The `save_model` flag must be set to True.
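
The repeated training runs can be scripted. The following is a minimal sketch, not part of the repository: it only rebuilds the command shown above for several seeds. The seed values `0, 1, 2` and the helper names `build_train_commands` and `run_all` are assumptions made for illustration, and the `--uuid` value is reused unchanged across seeds, so change it if your runs need distinct names.

```python
import shlex
import subprocess

# Values copied from the training command above.
YAML_FILE = "args_yml/hopper-v2-d4rl-mopo.yml"
UUID = "hopper_mixed_models"
# Assumption: one model per seed, three seeds in total.
SEEDS = (0, 1, 2)


def build_train_commands(seeds=SEEDS, yaml_file=YAML_FILE, uuid=UUID):
    """Return one argv list per seed, mirroring the documented command."""
    return [
        ["python", "train.py",
         "--yaml_file", yaml_file,
         "--seed", str(seed),
         "--uuid", uuid]
        for seed in seeds
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            # check=True stops the loop if a training run fails.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Dry run by default: prints the commands without starting training.
    run_all(build_train_commands(), dry_run=True)
```

Run it from the repository root and call `run_all(..., dry_run=False)` once the printed commands look right.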

Then, to run BO with 250 offline epochs of evaluation per iteration, run:
`python localglobal/main.py --offline_rl_yaml args_yml/bo_test_rig_hoppermixed.yml --offline_rl_epochs 250 --global_bo`
The flags `ensemble_replace_model_dirs` and `load_model_dir` in `args_yml/bo_test_rig_hoppermixed.yml` must be updated to point to the directory containing the saved models.

A Dockerfile is also provided.
