Published: 01 Jan 2022, Last Modified: 30 Apr 2023, UAI 2022, Readers: Everyone
Abstract: We consider the constrained Markov Decision Process (CMDP) problem, in which an agent interacts with an ergodic Markov Decision Process. At every interaction, the agent obtains a reward and incurs $...