This code is largely based on rllab:
https://github.com/rll/rllab

However, the modifications for multiple replay buffers and meta-RL training are largely inspired by Rakelly et al. (2019) http://arxiv.org/abs/1903.08254,
from the PEARL algorithm codebase:
https://github.com/katerakelly/oyster

# Installing MuJoCo

To run this code, please follow the getting-started directions for rllab and the PEARL codebase. We repeat some important information from those instructions here.

You will first need to install MuJoCo. For the task distributions in which the reward function varies (Cheetah, Ant, Humanoid), install MuJoCo200. Set LD_LIBRARY_PATH to include both the MuJoCo binaries ($HOME/.mujoco/mujoco200/bin) and the GPU drivers (something like /usr/lib/nvidia-390; you can find your version by running nvidia-smi).

For the task distributions where different tasks correspond to different model parameters (Walker), MuJoCo131 is required. Install it the same way as MuJoCo200.
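
A quick way to confirm the LD_LIBRARY_PATH setup described above is to check it from Python before launching anything. The following is a minimal sketch, not part of this codebase: the helper name `missing_library_paths` is hypothetical, and the two directories are simply the example locations mentioned above, so adjust them to your MuJoCo version and driver directory.

```python
import os


def missing_library_paths(required_dirs, env=None):
    """Return the entries of required_dirs that are absent from LD_LIBRARY_PATH."""
    env = os.environ if env is None else env
    # LD_LIBRARY_PATH is a list of directories joined by os.pathsep.
    entries = [
        os.path.normpath(p)
        for p in env.get("LD_LIBRARY_PATH", "").split(os.pathsep)
        if p
    ]
    return [d for d in required_dirs if os.path.normpath(d) not in entries]


if __name__ == "__main__":
    # Assumed install location, following the path given in this README.
    mujoco_bin = os.path.join(os.path.expanduser("~"), ".mujoco", "mujoco200", "bin")
    # Example driver directory only; use the one that matches your system.
    nvidia_dir = "/usr/lib/nvidia-390"
    for path in missing_library_paths([mujoco_bin, nvidia_dir]):
        print("Not on LD_LIBRARY_PATH:", path)
```

If the script prints nothing, both directories are already on the path. For the Walker tasks, the same check should work with the MuJoCo131 directory substituted in.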


# Dependencies
To install all other dependencies locally, run (using miniconda):
conda env create -f dependencies/environment.yml

If the above fails, note that this was tested only on Ubuntu 16.04 and macOS Catalina.

# Running an Experiment
To run an experiment, run

python launch_experiment.py ./configs/[EXPERIMENT].json

where [EXPERIMENT] is the configuration name for the environment.
For example, to run the cheetah forward-backward experiment:
python launch_experiment.py ./configs/cheetahdir-forward-back.json
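
To run several configurations one after another, the command above can be wrapped in a small script. This is only a sketch under assumptions: `build_command` and `run_all` are hypothetical helpers that are not part of this codebase, and the script assumes it is run from the repository root with every experiment config stored as a `.json` file in `./configs`.

```python
import subprocess
import sys
from pathlib import Path


def build_command(config_path, python=sys.executable):
    """Build the argument list for one launch_experiment.py run."""
    return [python, "launch_experiment.py", str(config_path)]


def run_all(config_dir="./configs", dry_run=True):
    """Launch every JSON config in config_dir, one after another."""
    commands = [build_command(p) for p in sorted(Path(config_dir).glob("*.json"))]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sweep as soon as one experiment fails.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Dry run by default: only prints the commands that would be executed.
    run_all()
```

Calling `run_all(dry_run=False)` actually starts the experiments sequentially. The dry-run default is a deliberate choice so that the list of commands can be inspected first.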



The out-of-distribution environment tasks are from the MIER algorithm codebase (Mendonca et al., 2020):
https://github.com/russellmendonca/mier_public/


# Note:
The models in this codebase were trained on a compute cluster using multiple GPUs.
