Keywords: Maximum-Entropy Reinforcement Learning, Robustness, Complexity Measures, Flat Minima, Fisher Information, Regularisation
TL;DR: This work shows links between the robustness of Maximum-Entropy RL policies and complexity measures borrowed from statistical learning theory.
Abstract: The generalisation and robustness properties of policies learnt through Maximum-Entropy Reinforcement Learning are investigated on chaotic dynamical systems with Gaussian noise on the observations.
First, the robustness of entropy-regularised policies under noise contamination of the agent's observations is examined.
Second, notions from statistical learning theory, such as complexity measures on the learnt model, are borrowed to explain and predict the phenomenon.
Results show a relationship between entropy-regularised policy optimisation and robustness to noise, which can be described by the chosen complexity measures.
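As a rough illustration of the robustness evaluation described in the abstract, the sketch below corrupts an agent's observations with zero-mean Gaussian noise of increasing standard deviation and records the resulting average return. The environment (Pendulum-v1), the policy interface, and the noise levels are illustrative assumptions and are not taken from the submission.

# Minimal sketch (assumptions, not the paper's code): measure how a trained
# policy's return degrades when i.i.d. Gaussian noise is added to observations.
import numpy as np
import gymnasium as gym


def noisy_rollout(env, policy, sigma, episodes=10, seed=0):
    """Average undiscounted return with observations corrupted by
    zero-mean Gaussian noise of standard deviation `sigma`."""
    rng = np.random.default_rng(seed)
    returns = []
    for ep in range(episodes):
        obs, _ = env.reset(seed=seed + ep)
        done, total = False, 0.0
        while not done:
            corrupted = obs + rng.normal(0.0, sigma, size=np.shape(obs))
            action = policy(corrupted)  # hypothetical policy callable
            obs, reward, terminated, truncated, _ = env.step(action)
            total += reward
            done = terminated or truncated
        returns.append(total)
    return float(np.mean(returns))


if __name__ == "__main__":
    env = gym.make("Pendulum-v1")  # placeholder task, not a system from the paper
    random_policy = lambda obs: env.action_space.sample()  # stand-in for a learnt policy
    for sigma in (0.0, 0.1, 0.5):  # sweep over noise levels
        print(sigma, noisy_rollout(env, random_policy, sigma))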
Already Accepted Paper At Another Venue: already accepted somewhere else
Submission Number: 97