# Annealed Implicit Q-learning

This code is confidential and intended for peer review purposes only.

The SAC implementation is based on https://github.com/proceduralia/high_replay_ratio_continuous_control

The TD3 implementation is based on https://github.com/sfujim/TD3


## Run
SAC-based AIQL
```
cd sac
python train_parallel.py --env_name hopper-hop --iql_loss --iql_tau 0.9
```

TD3-based AIQL
```
cd td3
python main.py --env hopper-hop --iql_tau 0.9 --iql_anneal
```
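
For orientation, `--iql_tau` presumably sets the expectile used by the implicit Q-learning loss, and `--iql_anneal` presumably changes that value over the course of training. The sketch below is not taken from this repository. It shows the standard IQL expectile loss, together with a hypothetical linear schedule: the function names, the `tau_end` value of 0.5, and the linear shape and direction of the annealing are all assumptions made for illustration.

```python
def expectile_loss(diff, tau):
    """Standard IQL asymmetric squared loss: |tau - 1(diff < 0)| * diff**2.

    `diff` is a single residual (for example, target minus prediction).
    """
    weight = tau if diff >= 0 else 1.0 - tau
    return weight * diff * diff


def annealed_tau(step, total_steps, tau_start=0.9, tau_end=0.5):
    """Hypothetical linear schedule from tau_start to tau_end.

    Assumption for illustration only; the schedule used by this
    repository may have a different shape, direction, or end point.
    """
    frac = min(max(step / total_steps, 0.0), 1.0)
    return tau_start + frac * (tau_end - tau_start)


if __name__ == "__main__":
    # With tau = 0.9, positive residuals are penalized 9x more than negative ones.
    print(expectile_loss(2.0, 0.9), expectile_loss(-2.0, 0.9))
    # Expectile at the start, midpoint, and end of a 100-step run.
    print([annealed_tau(s, 100) for s in (0, 50, 100)])
```

With `tau` above 0.5 the loss weights positive residuals more heavily, and at `tau = 0.5` it reduces to a symmetric squared error scaled by one half.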
