This folder contains the code for running the experiments on the MinAtar and Brax environments.

# Overview

The code is organized into the following files and directories:
- `a2c_continuous.py`: Contains a JAX implementation of A2C for continuous action spaces.
- `a2c_discrete.py`: Contains a JAX implementation of A2C for discrete action spaces.
- `evarl_experiments.sh`: The script for running all the EvA-RL experiments.
- `generate_continuous_action_offline_data_evarl_pred_transformer.py`: Contains the code for training A2C agents and collecting offline trajectories from them. It also contains the code for training a value prediction transformer model on the collected trajectories.
- `generate_discrete_action_offline_data_evarl_pred_transformer.py`: This is the discrete-action version of the above script.
- `ope_continuous.py`: Contains the code for running off-policy evaluation (OPE) using Fitted Q Evaluation (FQE), Doubly Robust, and per-decision importance sampling (PDIS).
- `ope_discrete.py`: This is the discrete-action version of the above script.
- `train_evarl_continuous.py`: Contains the code for training an EvA-RL agent for continuous action spaces.
- `train_evarl_discrete.py`: This is the discrete-action version of the above script.
- `train_scripts`: Contains the scripts for running the experiments.
- `utils.py`: Contains utility functions shared across the code.
- `wrappers.py`: Contains the wrappers for the JAX RL environments.
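
For readers unfamiliar with PDIS, one of the estimators listed for `ope_continuous.py` and `ope_discrete.py`, the following is a minimal, dependency-free sketch of the standard per-decision importance sampling estimator. It is not the implementation in this repository: the function name `pdis_estimate`, the trajectory layout (a list of `(target_prob, behavior_prob, reward)` steps), and the default `gamma` are assumptions chosen for illustration.

```python
from typing import Sequence, Tuple

# Hypothetical step layout: (pi_e(a|s), pi_b(a|s), reward).
# pi_e is the target (evaluation) policy, pi_b the behavior policy.
Step = Tuple[float, float, float]


def pdis_estimate(trajectories: Sequence[Sequence[Step]], gamma: float = 0.99) -> float:
    """Per-decision importance sampling estimate of the target policy's value.

    Each reward at time t is weighted by gamma**t and by the cumulative
    importance ratio of the actions taken up to and including time t.
    """
    if len(trajectories) == 0:
        raise ValueError("at least one trajectory is required")

    total = 0.0
    for trajectory in trajectories:
        rho = 1.0       # cumulative importance ratio up to the current step
        discount = 1.0  # gamma**t
        for target_prob, behavior_prob, reward in trajectory:
            rho *= target_prob / behavior_prob
            total += discount * rho * reward
            discount *= gamma
    # Average over trajectories.
    return total / len(trajectories)


if __name__ == "__main__":
    # Two toy trajectories with made-up probabilities and rewards.
    data = [
        [(0.5, 0.5, 1.0), (0.5, 0.5, 1.0)],
        [(0.8, 0.4, 1.0), (0.2, 0.4, 0.0)],
    ]
    print(pdis_estimate(data, gamma=0.9))
```

When the target and behavior policies assign the same probability to every logged action, all ratios equal 1 and the estimate reduces to the average discounted return of the logged trajectories.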

# Running the experiments

Before running the experiments, set your Weights & Biases (wandb) entity in the appropriate places in the code. Then run the following command:

```bash
bash evarl_experiments.sh
```

This creates the scripts and the folders for the output logs, runs the experiments, and finally plots the results shown in the paper.