Economic models of alignment and extrapolated volition

06 Jul 2025 (modified: 07 Jul 2025), ODYSSEY 2025 Conference Submission, CC BY 4.0
Keywords: alignment, extrapolated volition, economics, mechanism design, information economics
Abstract: We propose a simple formal definition of AI alignment motivated by an economic analogy. Market failures (such as imperfect information and imperfect assurance) correspond naturally to misalignment modes. In particular, though we leave this for future work, we believe that a "recursive information markets" mechanism leads to a formalization of extrapolated volition in terms of a value-of-information expression.
Serve As Reviewer: ~Abhimanyu_Pallavi_Sudhir1
Confirmation: I confirm that I and my co-authors have read the policies and are releasing our work under a CC-BY 4.0 license.
Submission Number: 23