Description of the files:
    - main.py launches the tests with different horizons for the goal-conditioned, Proximal Policy Optimization and Q-Learning methods
    - agents.py implementation of the goal-conditioned, Proximal Policy Optimization and Q-Learning methods
    - planning_agent.py implementation and tests of the planning method, which learns the forward dynamics and plans with an off-the-shelf solver
    - alphazero_agent.py and tree_search.py implement and test the AlphaZero method
    - models.py implementation of a simple deep learning architecture (MLP)
    - environment.py implementation, as an environment, of the family of problems used in the proof of our theoretical result
    - optimize_params.py code for optimizing the hyperparameters of the methods
