Rational Multi-Objective Agents Must Admit Non-Markov Reward Representations

Silviu Pitis; Duncan Bailey; Jimmy Ba

Published: 05 Dec 2022, Last Modified: 05 May 2023, MLSW2022, Readers: Everyone

Abstract: This paper considers intuitively appealing axioms for rational, multi-objective agents and derives an impossibility result, from which it follows that such agents must admit non-Markov reward representations. The axioms include the von Neumann-Morgenstern axioms, Pareto indifference, and dynamic consistency. We tie this result to irrational procrastination behaviors observed in humans, and show how the impossibility can be resolved by adopting a non-Markov aggregation scheme. Our work highlights the importance of non-Markov rewards for reinforcement learning and outlines directions for future work.
