# LEASE: Offline Preference-based Reinforcement Learning with High Sample Efficiency


Code for LEASE: Offline Preference-based Reinforcement Learning with High Sample Efficiency. The offline preference dataset comes from [Uni-RLHF](https://github.com/pickxiguapi/clean-offline-rlhf).

## Installation

To install all the required dependencies:

1. Install the MuJoCo 2.1.0 engine, which can be downloaded from [mujoco.org](https://mujoco.org/).

2. Install Python packages listed in `LEASE.yml`.
   ```bash
   conda env create -f LEASE.yml
   conda activate LEASE
   ```
   
## Usage

Run `run_iql.py` or `run_cql.py` and specify the task name. The hyperparameters are loaded automatically from `configs`.

```bash
python run_iql.py --task [task name]
```
Replace `[task name]` with the name of the task, e.g., `walker2d-medium-expert-v2`.
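
To compare both algorithms on the same task, the two scripts can be launched from a small loop. The sketch below is only an illustration: it assumes that `run_cql.py` accepts the same `--task` flag as `run_iql.py`, and the `DRY_RUN` variable is a hypothetical helper that is not part of this repository. By default it only prints the commands so that you can check them first.

```bash
# Sketch: run IQL and CQL on one task.
# Assumption: run_cql.py takes the same --task flag as run_iql.py.
TASK="walker2d-medium-expert-v2"
DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch; set DRY_RUN=0 to launch training

for script in run_iql.py run_cql.py; do
  if [ "$DRY_RUN" = "1" ]; then
    # Print the command without executing it.
    echo "python $script --task $TASK"
  else
    # Launch training for this script and task.
    python "$script" --task "$TASK"
  fi
done
```

Running the scripts one after the other keeps the example simple; change `TASK` to try a different task name.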

