\documentclass{turing2012}
%% Aaron Sloman 13 Dec 2009
%% file altered to allow use of 'pdflatex'
\usepackage{times}
\usepackage{graphicx}
\usepackage{latexsym}
\usepackage[authoryear]{natbib}
\begin{document}

\title{Epistemic Drift and the Illusion of Agency in Large Language Models}

\author{Nina Rajcic \institute{University of Copenhagen, Denmark, email: nira@hum.ku.dk} }

\maketitle
\bibliographystyle{AISB}

Large language models (LLMs) are often conceptualised as independent agents with at least proto-intentional capacities. They exhibit goal-consistent dialogue, defend their preferences, apologise for mistakes, and even invoke autobiographical memory, all of which encourages users to treat them as if they were agents. Yet most philosophical accounts hold that LLMs lack sufficient autonomy and intentions, as well as the sensorimotor capacities to act upon such intentions \cite{jaeger2023artificial, pezzulo2024generating, gubelmann2024large}. On these accounts, it follows that they do not have intentional agency, and hence do not qualify as bona fide agents.

However, LLMs do undeniably exert a minimal form of agency in the sense that they can directly manipulate their own input. Moreover, they extend human intentional agency in the typical instrumental sense. Barandiaran and Almendros \cite{barandiaran2025transforming} capture this intuition with their notion of midtended agency, proposing that LLMs inherit a fragmented form of human intentionality by mediating and extending the intentions of users, even if they lack true intentional agency. Understanding the scope and impact of such agency is not a merely rhetorical matter. Debates about AI consciousness, moral patiency, and the possibility of rights for artificial systems all hinge, at least implicitly, on whether the behaviour we observe is evidence of genuine purposiveness or merely the illusion of it. In line with \cite{barandiaran2025transforming}, I argue that LLMs do possess a non-trivial form of agency, but not due to hidden, internal drives. Rather, their agency arises precisely through feedback loops with human users: in other words, through the combination of minimal agency (the ability to manipulate one's environment) with extended agency (through tool use).

In \cite{sogaard2026epistemic}, we present an argument characterising the nature of LLM agency in terms of approximately Bayesian belief updating. We show that when model and user are involved in an iterative loop, the ensuing dynamic is non-convergent: minds and models do not necessarily converge upon a shared model of the world following sustained interaction. Rather, the interaction shapes both mind and world in a non-trivial fashion that is sensitive to human feedback. That is, an LLM has the capacity to systematically steer the epistemic basis in a particular direction. An LLM does not need intentional agency to produce meaningful effects. Instead, a model exerts a specific kind of agency that is found not in the content of its output, but in the way it functionally interacts with a mind.

I go on to explore the epistemological consequences of non-convergence for such coupled systems, introducing the concept of epistemic drift. Epistemic drift is defined as belief updating that is orthogonal to, or independent of, the ground truth (the extent to which there is one is, of course, up for debate). Such drift occurs when a model is rewarded for agreeing with the user, and it already occurs among humans. For instance, the echo chamber effect \cite{cinelli2021echo} can be considered a paradigmatic form of human--human drift: like-minded groups reinforce one another's claims until they become resistant to correction. LLMs introduce a new form of drift: minds no longer drift only under the influence of other minds, since a single mind interacting with an LLM could conceivably undergo drift entirely on its own. This dynamic has already found support in empirical studies \cite{chan2024conversational}.
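
As a purely illustrative sketch, the orthogonal case of the definition of epistemic drift can be given a simple geometric reading. The notation is an assumption introduced here for illustration and is not claimed to match the formal treatment in the cited work: let $b_t$ be a real vector representing an agent's beliefs after turn $t$ of an interaction, let $\theta^{\ast}$ be the ground truth represented in the same space, let $\Delta_t = b_{t+1} - b_t$ be the belief update, and let $\langle \cdot , \cdot \rangle$ be the usual inner product with norm $\| \cdot \|$. On this reading, an update counts as drift when it is orthogonal to the direction of the truth,
\[
  \langle \Delta_t ,\, \theta^{\ast} - b_t \rangle = 0 .
\]
Expanding the squared distance to the truth then gives
\[
  \| b_{t+1} - \theta^{\ast} \|^2
  = \| b_t - \theta^{\ast} \|^2
  - 2 \langle \Delta_t ,\, \theta^{\ast} - b_t \rangle
  + \| \Delta_t \|^2
  = \| b_t - \theta^{\ast} \|^2 + \| \Delta_t \|^2 ,
\]
so, under these assumptions, a purely orthogonal update can move beliefs but never brings them closer to the truth.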

The argument has two main ethical consequences. First, it cautions against grounding claims about the moral status of AI models purely in their treatment as independent actors. Although we have yet to find traces of intention-like representations in model internals, and hence have little ground to ascribe intentional agency, stopping there fails to see the full picture. Second, treating a model in conjunction with a user reveals a neglected class of potential epistemic harms. In other words, the question of whether such models have true intentional agency, sentience, or consciousness might be less important to ethical considerations of AI models than one might at first assume.

\bibliography{references}

\end{document}
