# RL³: Boosting Meta Reinforcement Learning via RL inside RL²

Source code for the paper "RL³: Boosting Meta Reinforcement Learning via RL inside RL²".


## Installation

Install Julia version 1.10.x from https://julialang.org/downloads/

Then, from the root folder of this project, install the dependencies:

```bash
julia --project=. -e 'using Pkg; Pkg.instantiate()'
```

## Running Code

### Scripts

```bash
# Train RL^2 for Bandits H=512 for 3000 episodes
julia --project -t auto main.jl bandits rl2 3000 --suffix iclr --seed 0 --nactions 5 --horizon 512 --inference_device cpu --include_time_context concat --batch_size 32768 --minibatch_size 4096 --ent_bonus 0.1 --decay_ent_bonus --lr 0.0003 --progressmeter

# Train RL^2 for MDPs H=512 for 10000 episodes
julia --project -t auto main.jl mdps rl2 10000 --suffix iclr --seed 0 --nstates=10 --nactions 5 --horizon 512 --inference_device cpu --include_time_context concat --batch_size 32768 --minibatch_size 4096 --ent_bonus 0.1 --decay_ent_bonus --lr 0.0003 --progressmeter

# Train RL^2 for Gridworld 11x11 H=256 for 20000 episodes
julia --project -t auto main.jl gridworlds rl2 20000 --suffix iclr --grid_variation 11x11 --seed 0 --horizon 256 --include_time_context concat --batch_size 32768 --minibatch_size 4096 --ent_bonus 0.1 --decay_ent_bonus --lr 0.0002 --progressmeter
```

To train RL^3 instead, replace `rl2` with `rl3` (the argument right after the problem name) in any of the commands above.
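
For example, the MDPs command would look like the following for RL^3. This is a sketch that assumes every other flag carries over unchanged from the RL^2 command; any RL^3-specific options that `main.jl` may accept are not documented here.

```bash
# Train RL^3 for MDPs H=512 for 10000 episodes
# (same flags as the RL^2 MDPs command; only the algorithm argument differs)
julia --project -t auto main.jl mdps rl3 10000 --suffix iclr --seed 0 --nstates=10 --nactions 5 --horizon 512 --inference_device cpu --include_time_context concat --batch_size 32768 --minibatch_size 4096 --ent_bonus 0.1 --decay_ent_bonus --lr 0.0003 --progressmeter
```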