# AdaSAC - random search - grid search comparison

## Installation guidelines
Python $3.11$ is required to install the dependencies.
```bash
conda create -n adasac python=3.11.5
conda activate adasac
conda install -c nvidia cuda-nvcc=12.3.52

python -m pip install -e .
python -m pip install --upgrade "jax[cuda12_pip]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
```

## Code structure
Each folder has a different purpose:
- The _experiments_ folder contains the skeleton of the SAC algorithm (experiments/SAC.py).
- The _launch\_job_ folder contains the scripts to launch the experiments. Two modes are available: local runs via launch_job/local_run.sh and remote runs via launch_job/cluster_run.sh. Local runs are launched in a tmux terminal, while remote runs rely on Slurm.
- The _sbx_ folder contains the backbone of SAC's skeleton. The main change introduced by AdaSAC is implemented in the sbx/common/policies.py file, where the classes _AutoVectorCritic_ and _AutoVectorOptimizer_ are responsible for training the networks with different hyperparameters in parallel. The proposed selection strategy for the target network is implemented in the function _selected_idx_update_ of the file sbx/sac/sac.py.
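
To give an intuition for the idea of training several networks with different hyperparameters in parallel and then selecting one of them, here is a minimal toy sketch in plain Python. It is not the code in sbx/common/policies.py or sbx/sac/sac.py: each "critic" is a single scalar weight, the function names `train_parallel_critics` and `select_idx` are hypothetical, and the selection rule used here (pick the critic with the lowest cumulated loss) is an assumption made for illustration only.

```python
import random


def train_parallel_critics(learning_rates, num_steps=200, seed=0):
    """Toy stand-in: one scalar 'critic' per learning rate, all fitting y = 2 * x."""
    rng = random.Random(seed)
    weights = [0.0 for _ in learning_rates]           # one online critic per hyperparameter
    cumulated_losses = [0.0 for _ in learning_rates]  # running loss of each critic
    for _ in range(num_steps):
        x = rng.uniform(-1.0, 1.0)
        y = 2.0 * x  # shared regression target, seen by every critic
        for i, lr in enumerate(learning_rates):
            error = weights[i] * x - y
            cumulated_losses[i] += error ** 2
            # Gradient step on the squared error, using this critic's own learning rate.
            weights[i] -= lr * 2.0 * error * x
    return weights, cumulated_losses


def select_idx(cumulated_losses):
    """Assumed selection rule: index of the critic with the lowest cumulated loss."""
    return min(range(len(cumulated_losses)), key=cumulated_losses.__getitem__)


learning_rates = [1e-3, 1e-2, 1e-1]
weights, losses = train_parallel_critics(learning_rates)
selected_idx = select_idx(losses)
print(f"selected critic: {selected_idx} (learning rate {learning_rates[selected_idx]})")
```

In this toy setting all critics see the same data, so the only difference between them is the hyperparameter. In the repository, the parallel training is handled by the classes named above rather than by a Python loop.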

## Running scripts
Running the script launch_job/local_run.sh trains an AdaSAC agent locally for $5 \times 10^{5}$ environment steps with $1$ seed on the Hopper environment. This script was used to generate Figures 4, 5, 10 (right), 12, 13, 14, 15, 16, 17, 18, 19 and 20.

## Acknowledgements
This project is based on the [Stable-Baselines3](https://github.com/DLR-RM/stable-baselines3) in JAX repository.
