# Joint Self-Supervised Learning for Vision-based Reinforcement Learning (JS2RL)

## Installation
We assume you have access to a GPU that supports CUDA 11.0.
The dependencies are listed in the `conda_env.yaml` file. In addition, you have to install `mujoco-py`, `dm_control`, and `hydra` (branch `0.11_branch`) from source using git.


## Create Simple Distractor background videos (creates 100 ideal-gas videos)
Run `python distractor/render_n_body_problem_envs.py`.


## Create Natural Video backgrounds (the Kinetics human action video dataset, http://arxiv.org/abs/1705.06950)
Download the driving car category of Kinetics 400.
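
The training commands in this README pass background videos as a glob pattern such as `your_natural_video_path/*.mp4`. Before starting a long run, it can help to confirm that the pattern actually matches some files. The snippet below is a minimal sketch of such a check; `check_video_pattern` is a hypothetical helper and is not part of this repository.

```python
import glob


def check_video_pattern(pattern):
    """Return the sorted files matching `pattern`, or raise if there are none.

    `pattern` is a glob such as "your_natural_video_path/*.mp4"
    (hypothetical helper, not part of the JS2RL code base).
    """
    files = sorted(glob.glob(pattern))
    if not files:
        raise FileNotFoundError(f"No files match {pattern!r}")
    return files


if __name__ == "__main__":
    # Replace the placeholder with your own video directory.
    try:
        videos = check_video_pattern("your_natural_video_path/*.mp4")
        print(f"Found {len(videos)} background videos")
    except FileNotFoundError as err:
        print(err)
```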


## The training parameters can be changed in `config_js2rl.yaml`


## To train a JS2RL agent on `walker walk`: Default background --> Default background (test) (Data Efficiency evaluation)
```
python train_js2rl_default.py env=walker_walk batch_size=128 action_repeat=2
```


## To train a JS2RL agent on `walker walk`: Simple Distractor background --> Simple Distractor background (test) (Data Efficiency evaluation)
```
python train_js2rl.py env=walker_walk batch_size=128 action_repeat=2 train_resource_files=your_simple_distractor_path/*.mp4 eval_resource_files=your_simple_distractor_path/*.mp4
```


## To train a JS2RL agent on `walker walk`: Natural Video background --> Natural Video background (test) (Data Efficiency evaluation)
```
python train_js2rl.py env=walker_walk batch_size=128 action_repeat=2 train_resource_files=your_natural_video_path/*.mp4 eval_resource_files=your_natural_video_path/*.mp4
```


## To train a JS2RL agent on `walker walk`: Simple Distractor background --> Natural Video background (test) (Generalization evaluation)
```
python train_js2rl.py env=walker_walk batch_size=128 action_repeat=2 train_resource_files=your_simple_distractor_path/*.mp4 eval_resource_files=your_natural_video_path/*.mp4
```
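
The four commands above differ only in the training script and in the `train_resource_files` / `eval_resource_files` arguments. As a rough sketch, they could be assembled from one helper, for example when scripting several runs. `build_train_command` is a hypothetical name and not part of this repository; the script names and arguments are copied from the commands above.

```python
import subprocess


def build_train_command(train_videos=None, eval_videos=None,
                        env="walker_walk", batch_size=128, action_repeat=2):
    """Build the argument list for one of the training settings in this README.

    With no video patterns, the default-background script is used.
    If only `train_videos` is given, the same pattern is used for evaluation.
    (Hypothetical helper, not part of the JS2RL code base.)
    """
    if train_videos is None and eval_videos is not None:
        raise ValueError("eval_videos requires train_videos")
    script = "train_js2rl_default.py" if train_videos is None else "train_js2rl.py"
    args = ["python", script, f"env={env}",
            f"batch_size={batch_size}", f"action_repeat={action_repeat}"]
    if train_videos is not None:
        args.append(f"train_resource_files={train_videos}")
        args.append(f"eval_resource_files={eval_videos or train_videos}")
    return args


if __name__ == "__main__":
    # Generalization setting: train on simple distractors, test on natural videos.
    cmd = build_train_command("your_simple_distractor_path/*.mp4",
                              "your_natural_video_path/*.mp4")
    print(" ".join(cmd))
    # Uncomment to launch the run from the repository root:
    # subprocess.run(cmd, check=True)
```

Because the arguments are passed as a list, the `*.mp4` pattern reaches the training script unexpanded, as it does in the shell commands above.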


### Train
In your console, you should see printouts that look like this:
```
| train | E: 1 | S: 500 | D: 1.0 s | R: 1.0000 | BR: 0.1 | A_LOSS: -1 | CLOSS: 2.3 | TLOSS: 1.0 | TVAL: 1.0 | AENT: 1.0
```

### Test
During evaluation, the printouts look like this:
```
| eval | E: 10 | S: 5000 | R: 10.0
```
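
To plot or compare runs, the console lines shown above can be turned into numbers. The following is a minimal sketch that assumes every cell after the mode has the form `KEY: value`, optionally followed by a unit as in `D: 1.0 s`; `parse_log_line` is a hypothetical helper and is not part of this repository.

```python
def parse_log_line(line):
    """Parse a line like '| eval | E: 10 | S: 5000 | R: 10.0'.

    Returns (mode, fields), where mode is e.g. 'train' or 'eval' and
    fields maps each key to a float.
    (Hypothetical helper, not part of the JS2RL code base.)
    """
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    mode, fields = cells[0], {}
    for cell in cells[1:]:
        key, sep, value = cell.partition(":")
        if not sep or not value.strip():
            continue  # skip cells that are not 'KEY: value'
        # Keep only the number; this drops a unit such as the 's' in 'D: 1.0 s'.
        fields[key.strip()] = float(value.split()[0])
    return mode, fields


if __name__ == "__main__":
    mode, fields = parse_log_line("| eval | E: 10 | S: 5000 | R: 10.0")
    print(mode, fields)
```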
