Stochastic Approximation of Gaussian Free Energy for Risk-Sensitive Reinforcement Learning

21 May 2021 (modified: 05 May 2023) · NeurIPS 2021 Submission
Keywords: risk-sensitivity, free energy, stochastic approximation, model-free reinforcement learning, safety
TL;DR: We introduce a stochastic approximation rule for estimating the free energy, and show how to apply it to risk-sensitive RL.
Abstract: We introduce a stochastic approximation rule for estimating the free energy from i.i.d. samples drawn from a Gaussian distribution with unknown mean and variance. The rule is a simple modification of the Rescorla-Wagner rule in which the (sigmoidal) stimulus is the event of over- or underestimating a target value. Since the Gaussian free energy is a certainty equivalent that is sensitive to both the mean and the variance, the learning rule has applications in risk-sensitive decision-making. In particular, we show how to use the rule in combination with the temporal-difference error to obtain risk-sensitive, model-free reinforcement learning algorithms.
Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.