Towards a more Unified, Explicable, and Generalized Representation of Human Utility

03 Dec 2023 (modified: 26 Jan 2024) · PKU 2023 Fall CoRe Submission · CC BY 4.0
Keywords: utility, RL
Abstract: Utility determines our preferences over decisions that involve uncertainty in every aspect of life. Because it is both hierarchical and multidimensional, however, no unified representation of it exists. This essay discusses two basic ways to obtain a reward function within the RL framework: reward engineering, and inverse learning without reward engineering. Based on how the reward is learned, three kinds of representations of human utility (prior knowledge, human feedback, and intrinsic motivation) are compared in terms of interpretability and generalization. Finally, we argue that intrinsic motivation could provide more general explanations, and that seeking a unified representation of different motivation dimensions is worth attention.
Submission Number: 167