
# Our Algorithm - Based on FamO2O Framework

Our algorithm builds on the [FamO2O framework](https://github.com/LeapLabTHU/FamO2O). Both offline agents, CQL (Conservative Q-Learning) and IQL (Implicit Q-Learning), are implemented in JAX.

## Setup

1. Clone this repository and install the required dependencies by following the installation instructions in the [FamO2O repository](https://github.com/LeapLabTHU/FamO2O).

## Step-by-Step Instructions

### 1. Train Behavior Cloning Model

The first step is to train a behavior cloning (BC) model. To do this:
- Navigate to the `jax_bc` folder.
- Run the following command:
  ```bash
  python train_bc.py
  ```
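
For orientation, behavior cloning fits a policy to the actions in the dataset by supervised regression. The following is a minimal pure-Python sketch of that idea, assuming a hypothetical one-dimensional linear policy and a made-up toy dataset. It is not the code in `train_bc.py`, and the names `bc_loss` and `fit_bc` are illustrative only.

```python
def bc_loss(w, b, states, actions):
    """Mean squared error between policy actions w*s + b and dataset actions."""
    n = len(states)
    return sum((w * s + b - a) ** 2 for s, a in zip(states, actions)) / n


def fit_bc(states, actions, lr=0.1, steps=500):
    """Fit the hypothetical linear policy by plain gradient descent."""
    w, b = 0.0, 0.0
    n = len(states)
    for _ in range(steps):
        # Gradients of the MSE loss with respect to w and b.
        grad_w = sum(2 * (w * s + b - a) * s for s, a in zip(states, actions)) / n
        grad_b = sum(2 * (w * s + b - a) for s, a in zip(states, actions)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy dataset generated by the "expert" rule a = 2*s - 1 (made up for illustration).
states = [-1.0, -0.5, 0.0, 0.5, 1.0]
actions = [2 * s - 1 for s in states]
w, b = fit_bc(states, actions)
```

On this toy data the fitted parameters approach the expert rule, so `bc_loss(w, b, states, actions)` ends close to zero.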

### 2. Pretrain an Offline Model

Next, pretrain an offline model and save it by following the pretraining instructions in the [FamO2O repository](https://github.com/LeapLabTHU/FamO2O).

### 3. Run Our CQL Version

Once the offline model is saved, you can proceed to run our modified version of CQL:
- Navigate to the `jax_our_cql` folder.
- Run the following command:
  ```bash
  python train_our_cql.py
  ```
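
As background, standard CQL adds a conservative penalty to the usual TD loss. In the discrete-action case this penalty is the log-sum-exp of the Q-values over all actions minus the Q-value of the dataset action. The sketch below computes that textbook term for a single state in pure Python. It does not reflect the modifications in `train_our_cql.py`, and `logsumexp` and `cql_penalty` are hypothetical helper names.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def cql_penalty(q_values, data_action):
    """Standard CQL penalty for one state with discrete actions.

    q_values: Q(s, a) for every action a.
    data_action: index of the action stored in the offline dataset.
    """
    return logsumexp(q_values) - q_values[data_action]


# The penalty is small when the dataset action already has the highest Q-value
# and large when an out-of-dataset action is overestimated.
low = cql_penalty([5.0, 0.0, 0.0], data_action=0)
high = cql_penalty([0.0, 5.0, 0.0], data_action=0)
```

Minimizing this term pushes Q-values down for actions the dataset does not support, which is the source of the "conservative" in the method's name.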

### 4. Run Our IQL Version

Alternatively, you can run our modified version of IQL:
- Navigate to the `jax_our_iql` folder.
- Run the following command:
  ```bash
  python train_our_iql.py
  ```
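
As background, standard IQL trains its value function with an expectile regression loss, which weights positive and negative errors asymmetrically. The pure-Python sketch below shows that textbook loss only. It does not reflect the modifications in `train_our_iql.py`; the names `expectile_loss` and `value_loss` and the value `tau=0.7` are assumptions chosen for illustration.

```python
def expectile_loss(diff, tau=0.7):
    """Asymmetric squared loss |tau - 1(diff < 0)| * diff**2.

    diff is Q(s, a) - V(s); tau in (0, 1) is the expectile level.
    """
    weight = tau if diff >= 0 else 1.0 - tau
    return weight * diff ** 2


def value_loss(q_values, v_values, tau=0.7):
    """Mean expectile loss over a batch of Q and V estimates."""
    diffs = [q - v for q, v in zip(q_values, v_values)]
    return sum(expectile_loss(d, tau) for d in diffs) / len(diffs)


under = expectile_loss(1.0)    # V below Q: error weighted by tau
over = expectile_loss(-1.0)    # V above Q: error weighted by 1 - tau
batch = value_loss([1.0, 0.0], [0.0, 1.0])
```

With `tau` above 0.5, underestimating Q costs more than overestimating it, so the learned value leans toward the upper range of Q-values seen in the dataset.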

