Abstract: Autonomous agents trained with Reinforcement Learning (RL) must explore the effects of their actions in different environment states in order to learn optimal control policies or to build a model of the environment. Exploration may be impractical in complex environments, so ways to prune the exploration space must be found. In this paper, we propose to augment an autonomous agent with a causal model of the core dynamics of its environment, learnt on a simplified version of that environment and then used as a “driving assistant” in larger or more complex ones. Experiments with different RL algorithms, in increasingly complex environments, and with different exploration strategies show that learning such a model improves the agent's behaviour.