## Installation
To install locally, first install [MuJoCo](https://www.roboti.us/index.html). For task distributions in which the reward function varies (Cheetah, Ant), install MuJoCo150 or later. Set `LD_LIBRARY_PATH` to include both the MuJoCo binaries (`$HOME/.mujoco/mujoco200/bin`) and the GPU drivers (something like `/usr/lib/nvidia-390`; you can find your driver version by running `nvidia-smi`).

For the remaining dependencies, create the conda environment with:
```
conda env create -f environment.yaml
```

<!-- For task distributions where the transition function (dynamics)  varies  -->

**For Walker environments**, MuJoCo131 is required.
Install it the same way as MuJoCo200. To switch between MuJoCo versions:

```
export MUJOCO_PY_MJPRO_PATH=~/.mujoco/mjpro${VERSION_NUM}
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/.mujoco/mjpro${VERSION_NUM}/bin
```
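When experiments are launched from a Python script rather than an interactive shell, the same two variables can be set per process. The following is a minimal sketch, not part of this repository: the helper name `mujoco_env` is hypothetical, and it assumes the `~/.mujoco/mjpro${VERSION_NUM}` layout used in the exports above.

```python
import os


def mujoco_env(version_num, base_env=None):
    """Return a copy of the environment with MuJoCo paths set for one version.

    Hypothetical helper: mirrors the two `export` lines above and assumes
    MuJoCo lives in ~/.mujoco/mjpro<version_num>.
    """
    env = dict(os.environ if base_env is None else base_env)
    root = os.path.join(os.path.expanduser("~"), ".mujoco", f"mjpro{version_num}")
    bin_dir = os.path.join(root, "bin")

    env["MUJOCO_PY_MJPRO_PATH"] = root
    # Append to any existing LD_LIBRARY_PATH instead of overwriting it.
    existing = env.get("LD_LIBRARY_PATH", "")
    env["LD_LIBRARY_PATH"] = f"{existing}:{bin_dir}" if existing else bin_dir
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.run`, so that a Walker run and a Cheetah run can use different MuJoCo versions without changing the parent shell.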

The environments use the module `rand_param_envs`, which is included in this repository as a submodule. Add it to your Python path with `export PYTHONPATH=./rand_param_envs:$PYTHONPATH`. (Check out [direnv](https://direnv.net/) for handy directory-dependent path management.)
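A quick way to confirm that the path is set correctly is to check whether Python can locate the module before starting a long run. This is a small sketch using only the standard library; the helper name `is_importable` is hypothetical, and it assumes the submodule exposes a top-level package named `rand_param_envs`.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be found on the current Python path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    name = "rand_param_envs"  # assumed top-level package name
    if is_importable(name):
        print(f"{name} found on the Python path")
    else:
        print(f"{name} not found; check PYTHONPATH")
```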


This installation has been tested only on 64-bit CentOS 7.2. The pipeline consists of two stages: **data generation** and **offline RL experiments**.

## Data Generation

FOCAL++ requires fixed (batch) data for meta-training and meta-testing, which is generated by trained [SAC](https://arxiv.org/pdf/1801.01290.pdf) behavior policies. Experiments at this stage are configured via `train.yaml`, located in `./rlkit/torch/sac/pytorch_sac/config/`.

Example of training policies and generating trajectories on multiple tasks:

```
python policy_train.py --gpu 0
```

Example of generating trajectories from pretrained models:

```
python policy_train.py --eval
```

Generated data will be saved in `./data/`.

## Offline RL Experiments
Experiments are configured via JSON configuration files located in `./configs`. Basic settings are defined and described in `./configs/default.py`. To reproduce an experiment, run:
```
python launch_experiment.py ./configs/[EXP].json
```
By default the code uses the GPU. To use the CPU instead, set `use_gpu=False` in the corresponding config file.

Output files will be written to `./output/[ENV]/[EXP NAME]`, where the experiment name corresponds to the process starting time. The file `progress.csv` contains statistics logged over the course of training, and `data_epoch_[EPOCH].csv` contains embedding vector statistics. We recommend [viskit](https://github.com/vitchyr/viskit) for visualizing learning curves. Network weights are also snapshotted during training.
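For a quick look at `progress.csv` without viskit, the file can be read with the standard `csv` module. The sketch below is illustrative only: the helper names `load_progress` and `numeric_column` are hypothetical, and because the logged column names are not listed here, it reads whatever header the file contains rather than assuming specific statistics.

```python
import csv


def load_progress(path):
    """Read a progress CSV and return (column_names, rows as dicts of strings)."""
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        rows = list(reader)
        columns = reader.fieldnames or []
    return columns, rows


def numeric_column(rows, name):
    """Return one column as floats, skipping missing or empty cells."""
    values = []
    for row in rows:
        cell = row.get(name)
        if cell in (None, ""):
            continue
        values.append(float(cell))
    return values
```

A typical use is to call `load_progress("./output/[ENV]/[EXP NAME]/progress.csv")`, print the returned column names to see which statistics were logged, and then pass one of those names to `numeric_column`.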

Example of running an experiment on the Sparse-Point-Robot (relabeled) environment:

- Download the [sparse point robot relabeled data](https://drive.google.com/file/d/1YQfTPwKuZvL1ITi9Zw8mf5fXbZ6Ip2h6/view?usp=sharing) and unzip it to `./data/`.
- Edit `sparse_point_robot_relabeled.json` to set `data_dir` to `./data/sparse_point_robot_relabeled`.
- Run `python launch_experiment.py ./configs/sparse_point_robot_relabeled.json`.
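The config edit in the steps above can also be scripted. This is a minimal sketch under stated assumptions: the helper name `set_data_dir` is hypothetical, and since the position of `data_dir` inside the JSON file is not specified here, the helper writes it at the top level by default and accepts an optional `section` name for configs that nest it.

```python
import json


def set_data_dir(config_path, data_dir, section=None):
    """Set `data_dir` in a JSON config file and return the updated config.

    Assumption: `data_dir` is a top-level key unless `section` names the
    nested object that should hold it. Check the actual config layout first.
    """
    with open(config_path) as f:
        config = json.load(f)

    target = config if section is None else config.setdefault(section, {})
    target["data_dir"] = data_dir

    with open(config_path, "w") as f:
        json.dump(config, f, indent=4)
    return config
```

For example, `set_data_dir("./configs/sparse_point_robot_relabeled.json", "./data/sparse_point_robot_relabeled")` rewrites the file in place, after which the experiment can be launched as usual.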





