Rethinking the Expressivity of Markov Reward: Reward is all you need

04 Dec 2023 (modified: 26 Jan 2024) · PKU 2023 Fall CoRe Submission · CC BY 4.0
Keywords: Markov Reward
Abstract: This paper engages with the ongoing debate on the nature and role of reward in human and artificial intelligence. Central to our discussion is the "Reward is enough" hypothesis, which posits that complex behaviors and cognitive functions, in both humans and artificial agents, can be reduced to and driven by reward maximization. We examine the theoretical underpinnings of this hypothesis, its practical implications, and the critiques and alternatives proposed in the literature. In particular, we focus on the expressivity of Markov reward functions and on augmenting state spaces to capture complex, history-dependent tasks previously thought to defy representation by traditional Markov rewards. Our analysis offers insight into the sufficiency of reward maximization as a unifying principle for intelligence and into the potential of expanded state spaces to address the limitations of current models.
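The state-augmentation idea in the abstract can be made concrete with a small sketch. The example below is not from the paper itself; it is a hypothetical illustration of the general technique: a history-dependent task admits no Markov reward on the raw state, but adding one bit of history to the state makes the reward Markovian.

```python
# Hedged sketch (illustrative, not the paper's construction):
# Raw environment: a single state s = 0 with actions "a" and "b".
# Task: reward +1 the FIRST time action "a" is taken, 0 afterwards.
# On the raw state, a Markov reward R(s, action) would have to assign
# both 1 and 0 to ("a" at s = 0), so no such function exists.
# Augmenting the state with a flag `used_a` restores Markov reward.

def augmented_step(aug_state, action):
    """One transition in the augmented MDP.

    aug_state = (raw_state, used_a), where used_a records whether
    action "a" has ever been taken (the one bit of added history).
    Returns (next_aug_state, reward).
    """
    raw_state, used_a = aug_state
    if action == "a" and not used_a:
        reward = 1.0  # first use of "a" is rewarded
    else:
        reward = 0.0  # repeated "a", or action "b": no reward
    next_used_a = used_a or (action == "a")
    return (raw_state, next_used_a), reward

# Usage: reward now depends only on the current augmented state and action.
s = (0, False)
s, r1 = augmented_step(s, "a")  # first "a" -> reward 1.0
s, r2 = augmented_step(s, "a")  # second "a" -> reward 0.0
```

The design choice here mirrors the abstract's claim: expressivity limits of Markov reward are relative to a chosen state space, and enlarging that space (here, by a single boolean) can recover tasks the raw formulation cannot express.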
Submission Number: 181