# Relaxed State-Adversarial Offline Reinforcement Learning: A Leap Towards Robust Model-Free Policies from Historical Data


## Dependencies 
To set up a python environment (with dev-tools of your taste, in our workflow, we use conda and python 3.8), just install all the requirements:

```commandline
python3 install -r requirements.txt
```

However, in this setup, you must install mujoco210 binaries by hand. Sometimes this is not super straightforward, but this recipe can help:
```commandline
mkdir -p /root/.mujoco \
    && wget https://mujoco.org/download/mujoco210-linux-x86_64.tar.gz -O mujoco.tar.gz \
    && tar -xf mujoco.tar.gz -C /root/.mujoco \
    && rm mujoco.tar.gz
export LD_LIBRARY_PATH=/root/.mujoco/mujoco210/bin:${LD_LIBRARY_PATH}
```
You may also need to install additional dependencies for mujoco_py. 
We recommend following the official guide from [mujoco_py](https://github.com/openai/mujoco-py).


## How to reproduce experiments

### Training

Configs for the main experiments are stored in the `configs/rebrac/<task_type>` 

All available hyperparameters are listed in the `src/algorithms/rebrac_rappo.py` 

For example, to start ReBRAC training process with D4RL `halfcheetah-medium-v2` dataset, run the following:
```commandline
PYTHONPATH=. python3 src/algorithms/rebrac_rappo.py --config_path="configs/rebrac/halfcheetah/halfcheetah_medium.yaml"
```

