Keywords: Large Language Models (LLMs), Self-Refinement, Policy Drift, Spectral Radius, Operator Theory, Convergence Analysis, Stability Guarantees, Reflective Prompting, Autonomous Agents, Control Theory
TL;DR: We develop a spectral-operator framework for analyzing and guaranteeing stability in self-refining LLMs, and show that the spectral radius of the policy update operator predicts convergence or drift in autonomous reasoning loops.
Abstract: Autonomous large language model (LLM) agents increasingly operate in self-refining loops, updating their internal reasoning or policy based on prior outputs. While this self-improvement paradigm enhances capability, it also introduces the risk of uncontrolled policy drift. This paper develops a theoretical framework for analyzing such dynamics using spectral operator theory. We model self-refinement as a nonlinear transformation \(T\) over policy space and derive sufficient conditions for convergence to an aligned fixed point \(\pi^*\) in terms of the spectral radius of the Jacobian \(J_T(\pi^*)\). We establish that if \(\rho(J_T(\pi^*))<1\), the agent’s refinement process is contractive near \(\pi^*\), which guarantees bounded drift and interpretable convergence. A small-scale empirical illustration using reflective prompting demonstrates how contraction coefficients can be estimated in LLM reflection loops. Our results provide the first spectral guarantees for stability and controllability in autonomous LLM agents.
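The stability condition \(\rho(J_T(\pi^*))<1\) can be illustrated numerically. The following is a minimal sketch, not the paper's method: it assumes a hypothetical toy refinement map `refine` on a 3-dimensional vector standing in for a policy, with an assumed fixed point `PI_STAR` and an assumed linear part `A`. It estimates the Jacobian at the fixed point by central differences, computes its spectral radius, and checks that repeated refinement shrinks the distance to the fixed point.

```python
import numpy as np

# Hypothetical toy setup: PI_STAR and A are illustrative assumptions,
# not quantities taken from the paper.
PI_STAR = np.array([0.5, 0.3, 0.2])
A = np.array([[0.5, 0.2, 0.0],
              [0.1, 0.4, 0.1],
              [0.0, 0.2, 0.3]])

def refine(pi):
    """Toy nonlinear refinement map T with fixed point PI_STAR."""
    d = pi - PI_STAR
    return PI_STAR + A @ d + 0.1 * d**2

def numerical_jacobian(f, x, eps=1e-6):
    """Central-difference estimate of the Jacobian of f at x."""
    n = x.size
    J = np.zeros((n, n))
    for j in range(n):
        step = np.zeros(n)
        step[j] = eps
        J[:, j] = (f(x + step) - f(x - step)) / (2 * eps)
    return J

def spectral_radius(M):
    """Largest eigenvalue magnitude of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(M))))

J = numerical_jacobian(refine, PI_STAR)
rho = spectral_radius(J)

# Iterate the refinement loop from a small perturbation of the fixed point
# and record the distance to PI_STAR before each step.
pi = PI_STAR + np.array([0.05, -0.03, 0.02])
distances = []
for _ in range(30):
    distances.append(float(np.linalg.norm(pi - PI_STAR)))
    pi = refine(pi)

print(f"spectral radius: {rho:.3f}")
print(f"distance to fixed point: {distances[0]:.4f} -> {distances[-1]:.2e}")
```

In this toy setting the spectral radius is below 1 and the distances decay toward zero. For an actual LLM reflection loop, the map \(T\) is not available in closed form, so the Jacobian or a contraction coefficient would have to be estimated from observed iterates instead.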
Submission Number: 74