# Implementation of iterated Shared Conservative Q-Learning (iS-CQL)

## User installation
We recommend Python 3.11.5. From the folder containing the code, create a Python virtual environment, activate it, upgrade pip, and install the package with its dependencies in editable mode:
```bash
python3 -m venv env
source env/bin/activate
pip install --upgrade pip setuptools wheel
# Quote the extras so shells such as zsh do not treat the brackets as a glob pattern.
pip install -e ".[dev,gpu]"
```
To verify the installation, run the tests with `pytest`.

## Running experiments
First, download the dataset for the game of interest from https://console.cloud.google.com/storage/browser/rl_unplugged.
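
As an alternative to the web console, the download can be scripted with `gsutil`. The sketch below is a minimal example, not part of this repository: the helper names `dataset_uri` and `download_game` are hypothetical, and the bucket layout `gs://rl_unplugged/atari/<Game>` is an assumption that should be checked against the console link above before use.

```shell
# Hypothetical helpers (not part of this repository).
# Assumption: each game is stored under gs://rl_unplugged/atari/<Game>.

# Print the assumed bucket URI for one game, e.g. "Asterix".
dataset_uri() {
  printf 'gs://rl_unplugged/atari/%s\n' "$1"
}

# Copy the whole game folder into a local destination directory.
# Requires the Google Cloud `gsutil` CLI to be installed.
download_game() {
  local game="$1"
  local dest="$2"
  mkdir -p "$dest"
  # -m parallelizes the transfer; -r copies the folder recursively.
  gsutil -m cp -r "$(dataset_uri "$game")" "$dest"
}
```

After sourcing these functions, `download_game Asterix YOUR_PATH` would place the data under `YOUR_PATH/Asterix`, with `YOUR_PATH` replaced by a local directory of your choice.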

Then, convert the dataset to our replay buffer format using `launch_job/atari/launch_data_prep.sh`. Replace `YOUR_PATH` with the correct location before running it.

Finally, the script `launch_job/atari/launch.sh` trains an iS-CQL (K=9) agent with the IMPALA architecture and LayerNorm on the game Asterix, on a local machine. Replace `YOUR_PATH` with the correct location before running it.
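
The `YOUR_PATH` placeholder can be edited by hand, or substituted in both scripts at once. The following is a minimal sketch under a few assumptions: the placeholder appears literally as `YOUR_PATH` in the scripts, the helper `replace_your_path` and the `DATA_DIR` variable are hypothetical names, and the target path contains no `|` or `&` characters (which `sed` would interpret specially here).

```shell
# Hypothetical helper (not part of this repository).
# Usage: replace_your_path <new_path> <script>...
# Replaces every literal YOUR_PATH in the given scripts and keeps a .bak copy.
replace_your_path() {
  local new_path="$1"
  shift
  local script
  for script in "$@"; do
    if [ -f "$script" ]; then
      # -i.bak edits in place and works with both GNU and BSD sed.
      sed -i.bak "s|YOUR_PATH|${new_path}|g" "$script"
    else
      echo "skipping ${script}: file not found" >&2
    fi
  done
}

# DATA_DIR is an assumed variable; the default below is only an example.
replace_your_path "${DATA_DIR:-$HOME/rl_unplugged}" \
  launch_job/atari/launch_data_prep.sh \
  launch_job/atari/launch.sh
```

Run it from the folder containing the code. The original scripts remain available as `launch_data_prep.sh.bak` and `launch.sh.bak` in case the substitution needs to be undone.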