Keywords: Maximum-Entropy Reinforcement Learning, Robustness, Complexity Measures, Flat Minima, Fisher Information, Regularisation
TL;DR: This work shows links between the robustness of Maximum-Entropy RL policies and complexity measures borrowed from statistical learning theory.
Abstract: The generalisation and robustness properties of policies learnt through Maximum-Entropy Reinforcement Learning are investigated on chaotic dynamical systems with Gaussian noise on the observations.
First, the robustness of entropy-regularised policies under noise contamination of the agent's observations is examined.
Second, notions from statistical learning theory, such as complexity measures on the learnt model, are borrowed to explain and predict the phenomenon.
Results show a relationship between entropy-regularised policy optimisation and robustness to noise, which can be described by the chosen complexity measures.
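As a rough illustration of the robustness evaluation described in the abstract, the sketch below corrupts an agent's observations with zero-mean Gaussian noise of increasing standard deviation and records the resulting average return. The environment (Pendulum-v1), the policy interface, and the noise levels are illustrative assumptions and are not taken from the submission.

# Minimal sketch (assumptions, not the paper's code): measure how a trained
# policy's return degrades when i.i.d. Gaussian noise is added to observations.
import numpy as np
import gymnasium as gym


def noisy_rollout(env, policy, sigma, episodes=10, seed=0):
    """Average undiscounted return with observations corrupted by
    zero-mean Gaussian noise of standard deviation `sigma`."""
    rng = np.random.default_rng(seed)
    returns = []
    for ep in range(episodes):
        obs, _ = env.reset(seed=seed + ep)
        done, total = False, 0.0
        while not done:
            corrupted = obs + rng.normal(0.0, sigma, size=np.shape(obs))
            action = policy(corrupted)  # hypothetical policy callable
            obs, reward, terminated, truncated, _ = env.step(action)
            total += reward
            done = terminated or truncated
        returns.append(total)
    return float(np.mean(returns))


if __name__ == "__main__":
    env = gym.make("Pendulum-v1")  # placeholder task, not a system from the paper
    random_policy = lambda obs: env.action_space.sample()  # stand-in for a learnt policy
    for sigma in (0.0, 0.1, 0.5):  # sweep over noise levels
        print(sigma, noisy_rollout(env, random_policy, sigma))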
Already Accepted Paper At Another Venue: already accepted somewhere else
Submission Number: 97