This codebase implements all five algorithms whose scores are reported in the main submission. It is built on top of an existing, publicly available vanilla-SimPLe implementation: https://github.com/tensorflow/tensor2tensor (also cited in the paper).

Dependencies :
The dependencies are listed in requirements.txt.
Important note :
Please make sure that CUDA 10.0 and cuDNN >= 7.6.5 are installed. If tensorflow runs on other CUDA/cuDNN versions, the code may appear to run, but the agent will not learn.

The codebase is written in Python 3. The experiments were performed with the following system specifications :
Python 3 (version 3.6.4 or 3.7.4)
Ubuntu 18.04


Use the following commands to install the dependencies and run the code in a Linux-based environment (preferably Ubuntu 18.04).
We recommend setting up a new virtual environment with virtualenv or conda before installing the dependencies and running the agents, to ensure a clean installation.


Create and activate a new environment

    conda create --name evade python=3.6
    conda activate evade

Install the correct versions of CUDA and cuDNN

    conda install cudatoolkit=10.0
    conda install cudnn


Check your installation with the following commands :

    conda list cudnn
    conda list cuda

These should show version 10.0.130 for cudatoolkit and >= 7.6.5 for cudnn.
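
The version check above can also be scripted. The following is a minimal sketch, not part of this codebase: it assumes the usual `conda list` output layout (comment lines starting with `#`, then one package per line with the name and version as the first two columns). The helper names `parse_conda_list`, `version_tuple` and `check_cuda_stack` are hypothetical.

```python
import re


def parse_conda_list(text):
    """Map package name -> version string from `conda list` output."""
    versions = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the comment header
        fields = line.split()
        if len(fields) >= 2:
            versions[fields[0]] = fields[1]
    return versions


def version_tuple(version):
    """Turn '10.0.130' into (10, 0, 130) for numeric comparison."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def check_cuda_stack(text):
    """Return a list of problems; an empty list means the versions look right."""
    versions = parse_conda_list(text)
    problems = []
    cuda = versions.get("cudatoolkit")
    if cuda is None or version_tuple(cuda)[:2] != (10, 0):
        problems.append("cudatoolkit should be 10.0.x, found: {}".format(cuda))
    cudnn = versions.get("cudnn")
    if cudnn is None or version_tuple(cudnn) < (7, 6, 5):
        problems.append("cudnn should be >= 7.6.5, found: {}".format(cudnn))
    return problems
```

To use it, pass the combined text printed by `conda list` inside the activated environment to `check_cuda_stack` and inspect the returned list.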

After installing the CUDA dependencies, install the required packages by running :

    ./install.sh

Training Agents : 

This codebase includes five training algorithms. Use change_agent.sh to switch between them.

To change the training algorithm to SimPLe(30) use the command : 
	./change_agent.sh SimPLe_30

To change the training algorithm to EVaDE-SimPLe use the command : 
	./change_agent.sh EVaDE_SimPLe

To change the training algorithm to SimPLe(30) equipped with only the noisy event interaction layers use the command : 
	./change_agent.sh interaction_layer

To change the training algorithm to SimPLe(30) equipped with only the noisy event weighting layers use the command : 
	./change_agent.sh weighting_layer

To change the training algorithm to SimPLe(30) equipped with only the noisy event translation layers use the command : 
	./change_agent.sh translation_layer

By default, ./install.sh installs the EVaDE-SimPLe agent.

After selecting the training algorithm, train the agent with the corresponding command below.
Please note that these commands should be run from the main directory, i.e., the directory containing install.sh.

Command to train a SimPLe_30 agent :
	python -m tensor2tensor.rl.trainer_model_based -output_dir=<output dir> --loop_hparams_set=rlmb_base --loop_hparams=game=<atari game name> -d_id <gpu id>;

	Example command :
	python -m tensor2tensor.rl.trainer_model_based -output_dir=BankHeist_simple_30_run1 --loop_hparams_set=rlmb_base --loop_hparams=game=BankHeist -d_id 0;

Command to train an EVaDE-SimPLe agent :
	python -m tensor2tensor.rl.trainer_model_based_particle -output_dir=<output dir> --loop_hparams_set=rlmb_base --loop_hparams=game=<atari game name> -d_id <gpu id>;

	Example command :
	python -m tensor2tensor.rl.trainer_model_based_particle -output_dir=BankHeist_evade_simple_run1 --loop_hparams_set=rlmb_base --loop_hparams=game=BankHeist -d_id 0;

Command to train a SimPLe_30 agent equipped with only the noisy event interaction layers:
	python -m tensor2tensor.rl.trainer_model_based_particle -output_dir=<output dir> --loop_hparams_set=rlmb_base --loop_hparams=game=<atari game name> -d_id <gpu id>;
    	
    	Example command :
	python -m tensor2tensor.rl.trainer_model_based_particle -output_dir=BankHeist_simple_inter_run1 --loop_hparams_set=rlmb_base --loop_hparams=game=BankHeist -d_id 0;


Command to train a SimPLe_30 agent equipped with only the noisy event weighting layers:
	python -m tensor2tensor.rl.trainer_model_based_particle -output_dir=<output dir> --loop_hparams_set=rlmb_base --loop_hparams=game=<atari game name> -d_id <gpu id>;
    	
    	Example command :
	python -m tensor2tensor.rl.trainer_model_based_particle -output_dir=BankHeist_simple_weight_run1 --loop_hparams_set=rlmb_base --loop_hparams=game=BankHeist -d_id 0;

Command to train a SimPLe_30 agent equipped with only the noisy event translation layers:
	python -m tensor2tensor.rl.trainer_model_based_particle -output_dir=<output dir> --loop_hparams_set=rlmb_base --loop_hparams=game=<atari game name> -d_id <gpu id>;
    	
    	Example command :
	python -m tensor2tensor.rl.trainer_model_based_particle -output_dir=BankHeist_simple_trans_run1 --loop_hparams_set=rlmb_base --loop_hparams=game=BankHeist -d_id 0;
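
When launching many runs, it can help to generate the two commands (switch the agent, then train) programmatically. The sketch below is an illustration under stated assumptions and is not part of this codebase: the `TRAINERS` table and the `build_commands` helper are hypothetical names, and the table only restates which trainer module each change_agent.sh argument is paired with in the commands above.

```python
import shlex

PARTICLE = "tensor2tensor.rl.trainer_model_based_particle"

# change_agent.sh argument -> trainer module, as used in the commands above.
TRAINERS = {
    "SimPLe_30": "tensor2tensor.rl.trainer_model_based",
    "EVaDE_SimPLe": PARTICLE,
    "interaction_layer": PARTICLE,
    "weighting_layer": PARTICLE,
    "translation_layer": PARTICLE,
}


def build_commands(agent, game, output_dir, gpu_id=0):
    """Return (switch_cmd, train_cmd) as argument lists for one run."""
    if agent not in TRAINERS:
        raise ValueError("unknown agent: {}".format(agent))
    switch_cmd = ["./change_agent.sh", agent]
    train_cmd = [
        "python", "-m", TRAINERS[agent],
        "-output_dir={}".format(output_dir),
        "--loop_hparams_set=rlmb_base",
        "--loop_hparams=game={}".format(game),
        "-d_id", str(gpu_id),
    ]
    return switch_cmd, train_cmd


if __name__ == "__main__":
    switch_cmd, train_cmd = build_commands(
        "EVaDE_SimPLe", "BankHeist", "BankHeist_evade_simple_run1")
    # Print the commands instead of running them, so they can be reviewed.
    print(" ".join(shlex.quote(arg) for arg in switch_cmd))
    print(" ".join(shlex.quote(arg) for arg in train_cmd))
```

The argument lists can be passed to `subprocess.run(cmd, check=True)` from the directory containing install.sh. Since change_agent.sh changes the installed agent, runs with different agents are best executed one after another rather than in parallel.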
	

For every trained agent, <output dir>/eval_metrics will contain a tensorboard events file that can be used to track the agent's learning progress. The score achieved at the 30th iteration is taken as the agent's final score. It is given by the value of the mean_reward/eval/sampling_temp_1.0_max_noops_8_unclipped metric at step 29 (steps are numbered from 0) in this tensorboard events file.
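
The final score can also be read from the events file without opening TensorBoard. The following is a rough sketch, not part of this codebase, and it rests on several assumptions: that the events file follows the default `events.out.tfevents.*` naming, that the metric is stored as a scalar (`simple_value`), and that the installed tensorflow exposes `tf.compat.v1.train.summary_iterator`. The helper names `read_scalars` and `final_score` are hypothetical.

```python
import glob
import os

FINAL_TAG = "mean_reward/eval/sampling_temp_1.0_max_noops_8_unclipped"
FINAL_STEP = 29  # the 30th iteration, counting from step 0


def read_scalars(output_dir, tag=FINAL_TAG):
    """Yield (step, value) pairs for `tag` from <output_dir>/eval_metrics."""
    import tensorflow as tf  # imported lazily; only needed to read files

    # Assumption: default TensorFlow event file naming.
    pattern = os.path.join(output_dir, "eval_metrics", "events.out.tfevents.*")
    for path in sorted(glob.glob(pattern)):
        for event in tf.compat.v1.train.summary_iterator(path):
            for value in event.summary.value:
                if value.tag == tag:
                    # Assumption: the metric is a scalar summary.
                    yield event.step, value.simple_value


def final_score(scalars, step=FINAL_STEP):
    """Pick the value logged at `step` from an iterable of (step, value)."""
    matches = [value for logged_step, value in scalars if logged_step == step]
    if not matches:
        raise ValueError("no value logged at step {}".format(step))
    return matches[-1]
```

For example, `final_score(read_scalars("BankHeist_evade_simple_run1"))` would return the value logged at step 29 for that run, or raise `ValueError` if training has not reached the 30th iteration yet.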

Useful notes : 

- All experiments were performed on a machine with no TPUs. The codebase on which our code is built also supports TPUs. However, we suggest training only on GPUs, as that matches the conditions under which the reported scores were obtained.

  

