# The Three Regimes of Offline-to-Online Reinforcement Learning

This repository contains the code for the experiments in our work.
We include local copies of `mj_envs` and `D4RL` under `dependencies/` to ease reproduction. These copies come from the upstream open-source projects and include minor bug fixes.

## Getting started

### Environment setup
```
conda create -n o2o python=3.10 -y
conda activate o2o
pip install -r requirements.txt

pip install --upgrade "jax[cuda11_pip]==0.4.20" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
# To run multiple jobs on a single GPU, we recommend setting this variable so that JAX does not preallocate GPU memory.
export XLA_PYTHON_CLIENT_PREALLOCATE=false
```
Install D4RL:
```
cd dependencies/d4rl
pip install -e .
```

To use MuJoCo, install it manually under `~/.mujoco/` (the paths below assume `~/.mujoco/mujoco210`) and extend the library path:
```bash
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/.mujoco/mujoco210/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/nvidia
```
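
A missing MuJoCo directory usually only surfaces later as an import error, so it can help to check the path before exporting it. The following is a minimal sketch, not part of this repository: `check_mujoco` is a hypothetical helper name, and the default location is taken from the export lines above.

```bash
# Hypothetical helper (not part of this repository): succeeds only if the
# MuJoCo binaries directory exists under the given root.
check_mujoco() {
  local root="${1:-$HOME/.mujoco/mujoco210}"
  if [ -d "$root/bin" ]; then
    echo "found MuJoCo at $root"
  else
    echo "MuJoCo not found at $root" >&2
    return 1
  fi
}

# Extend the library path only when the check succeeds.
# The ${VAR:+...} form avoids a leading ':' when LD_LIBRARY_PATH is empty.
if check_mujoco; then
  export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$HOME/.mujoco/mujoco210/bin"
fi
```

Pass a different root as the first argument if MuJoCo is installed elsewhere, for example `check_mujoco /opt/mujoco210`.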

To use the Adroit environments, install the local copy of `mj_envs`:
```
cd dependencies/mj_envs
pip install -e .
```

Download the Adroit dataset from [this link](https://drive.google.com/file/d/1yUdJnGgYit94X_AvV6JJP5Y3Lx2JF30Y/view) and unzip the files into `~/adroit_data/`.

## Example usage
Pretrain a Cal-QL agent offline:
```
ENV_NAME=kitchen-mixed-v0 SEED=1 bash run_calql_offline.sh
```

Fine-tune a SAC agent online, starting from Cal-QL's pretrained parameters:
```
ENV_NAME=kitchen-mixed-v0 SEED=1 bash run_sac_finetune.sh
```
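
To repeat both stages over several seeds, the two commands above can be wrapped in a small loop. The sketch below is illustrative only: `run_pipeline` and `DRY_RUN` are hypothetical names that do not exist in this repository, and it assumes the fine-tuning script locates the checkpoint produced by the offline run with the same `ENV_NAME` and `SEED`.

```bash
# Hypothetical wrapper (not part of this repository): runs offline
# pretraining and then online fine-tuning for one environment and seed.
# With DRY_RUN=1 it prints the commands instead of running them.
run_pipeline() {
  local env_name="$1" seed="$2" script
  for script in run_calql_offline.sh run_sac_finetune.sh; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "ENV_NAME=$env_name SEED=$seed bash $script"
    else
      # Stop early if a stage fails, so fine-tuning never starts
      # without a finished pretraining run.
      ENV_NAME="$env_name" SEED="$seed" bash "$script" || return 1
    fi
  done
}

# Preview the commands for three seeds; drop DRY_RUN=1 to launch the runs.
for seed in 1 2 3; do
  DRY_RUN=1 run_pipeline kitchen-mixed-v0 "$seed"
done
```

The runs execute sequentially. To run several at once on a single GPU, keep `XLA_PYTHON_CLIENT_PREALLOCATE=false` exported as described in the environment setup.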