Formalizing Embeddedness Failures in Universal Artificial Intelligence

01 Jul 2025 (modified: 02 Jul 2025), ODYSSEY 2025 Conference Submission, CC BY 4.0
Keywords: Universal artificial intelligence, Solomonoff induction, evidential decision theory
TL;DR: We investigate simple variants of the AIXI agent designed to capture problems of embedded agency.
Abstract: We rigorously discuss the commonly asserted failures of the AIXI reinforcement learning agent as a model of embedded agency. We attempt to formalize these failure modes and prove that they occur within the framework of universal artificial intelligence, focusing on very simple variants of AIXI. We introduce joint AIXI, which models the joint action/percept history as drawn from the universal distribution, and hardened AIXI, which recovers from side-channel attacks by recalculating its own previous actions. We also evaluate the progress that has been made towards a successful theory of embedded agency based on variants of the AIXI agent.
Serve As Reviewer: ~Cole_Wyeth1
Confirmation: I confirm that I and my co-authors have read the policies and are releasing our work under a CC-BY 4.0 license.
Submission Number: 9