# Bayesian Offline-to-Online Reinforcement Learning: A Realist Approach

## Usage
The results in the paper were collected with [MuJoCo 1.50](http://www.mujoco.org/) (via [mujoco-py 1.50.1.1](https://github.com/openai/mujoco-py)) in [OpenAI Gym 0.17.0](https://github.com/openai/gym) on the [D4RL datasets](https://github.com/rail-berkeley/d4rl). Networks were trained with [PyTorch 1.4.0](https://github.com/pytorch/pytorch) on Python 3.6.
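
Because the versions listed above are fairly old, it can help to confirm what is actually installed before running anything. The following is a minimal sketch, not part of this repository: the helper names `installed_version` and `find_mismatches` are hypothetical, and the distribution names (`mujoco-py`, `gym`, `torch`) are assumed to match how the packages were installed with pip. D4RL is left out because no version is pinned for it.

```python
# Versions stated in this README; the dictionary keys are assumed pip distribution names.
EXPECTED = {
    "mujoco-py": "1.50.1.1",
    "gym": "0.17.0",
    "torch": "1.4.0",
}


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
    except ImportError:
        # Fallback for Python 3.6/3.7, where importlib.metadata is unavailable.
        import pkg_resources
        try:
            return pkg_resources.get_distribution(package).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each mismatched package to a (wanted, found) pair."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        # Some wheels carry a local suffix such as "+cu101"; compare only the public part.
        if found is None or found.split("+")[0] != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, found) in find_mismatches(EXPECTED).items():
        print("{}: expected {}, found {}".format(package, wanted, found))
```

Running the script prints one line per package whose installed version differs from the one above, and prints nothing when everything matches.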

## Offline Training
```
python3 main.py --save_model --env hopper-medium-v2
```

## Online Finetuning
```
python3 finetune.py --load_model --offdataset --env hopper-medium-v2
```
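
To run both stages back to back for a given environment, the two commands above can be chained from a small driver script. This is a sketch under assumptions rather than something shipped with the repository: `build_commands` and `run_pipeline` are hypothetical helpers, and it is assumed that `finetune.py --load_model` picks up the model written by `main.py --save_model` without any extra path arguments.

```python
import subprocess


def build_commands(env):
    """Return the offline-training and online-finetuning commands for `env` as argv lists."""
    offline = ["python3", "main.py", "--save_model", "--env", env]
    online = ["python3", "finetune.py", "--load_model", "--offdataset", "--env", env]
    return [offline, online]


def run_pipeline(env):
    """Run offline training, then online finetuning; stop at the first failing stage."""
    for command in build_commands(env):
        # check=True raises CalledProcessError if the stage exits with a non-zero status.
        subprocess.run(command, check=True)


# Example (run from the repository root):
#   run_pipeline("hopper-medium-v2")
```

Passing the commands as argument lists rather than a single shell string avoids quoting issues if the environment name is supplied by the user.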