The code is written in Python.

To keep packages from different projects from interfering with each other, it is a good idea to run this code in its own conda environment.

To create a conda environment with Python 3.6, run: conda create -n test python=3.6
To activate the environment, run: conda activate test

Then install the packages required by this code:

pip install numpy==1.16.3
pip install tensorflow==1.13.1
pip install tensorflow-probability==0.6.0
pip install opencv-python
pip install cloudpickle
pip install gym
pip install matplotlib
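
After installing, it can help to confirm that the environment actually holds the pinned versions. The script below is a minimal sketch and not part of this repository: the names PINNED, installed_version and find_problems are hypothetical, and the pins are simply copied from the pip commands above (packages installed without a version are only checked for presence).

```python
# Pins copied from the install commands above; unpinned packages map to None.
PINNED = {
    "numpy": "1.16.3",
    "tensorflow": "1.13.1",
    "tensorflow-probability": "0.6.0",
    "opencv-python": None,
    "cloudpickle": None,
    "gym": None,
    "matplotlib": None,
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        # Available in the standard library from Python 3.8 onwards.
        from importlib.metadata import version, PackageNotFoundError
    except ImportError:
        # Fallback for Python 3.6/3.7; pkg_resources ships with setuptools.
        import pkg_resources
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def find_problems(pinned, lookup=installed_version):
    """List missing packages and version mismatches as readable strings."""
    problems = []
    for name, wanted in pinned.items():
        found = lookup(name)
        if found is None:
            problems.append("{} is not installed".format(name))
        elif wanted is not None and found != wanted:
            problems.append("{} is {} but {} is expected".format(name, found, wanted))
    return problems


if __name__ == "__main__":
    for line in find_problems(PINNED) or ["all packages look fine"]:
        print(line)
```

Saved under any file name and run inside the activated environment, the script prints one line per missing or mismatched package. The lookup argument exists only so the comparison logic can be exercised without touching the real environment.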

You can then run main.py to train agents. By default, the hyperparameters are set up for training L-REINFORCE in Cartpole. To test another algorithm, open variant.py and set 'algorithm_name' to the corresponding value.

