A Tighter Problem-Dependent Regret Bound for Risk-Sensitive Reinforcement Learning

Published: 2023, Last Modified: 08 Jul 2023 | AISTATS 2023 | Readers: Everyone
Abstract: We study the regret for risk-sensitive reinforcement learning (RL) with the exponential utility in the episodic MDP. Recent works establish both a lower bound $\Omega((e^{|\beta|(H-1)/2}-1)\sqrt{SA...
