TL;DR: Demonstrations of LLM-induced "lock-in" of user ideation, based on real-world data, simulations, and a formal model.
Abstract: The training and deployment of large language models (LLMs) create a feedback loop with human users: models learn human beliefs from data, reinforce these beliefs with generated content, reabsorb the reinforced beliefs, and feed them back to users again and again. This dynamic resembles an echo chamber.
We hypothesize that this feedback loop entrenches the existing values and beliefs of users, leading to a loss of diversity in human ideas and potentially the *lock-in* of false beliefs.
We formalize this hypothesis and test it empirically with agent-based LLM simulations and real-world GPT usage data.
Analysis reveals sudden but sustained drops in diversity after the release of new GPT iterations, consistent with the hypothesized human-AI feedback loop.
*Website: https://thelockinhypothesis.com*
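
As a minimal sketch of the hypothesized dynamic (not the paper's formal model or its agent-based LLM simulations), the toy loop below assumes agents hold scalar beliefs, a "model" is refit each generation to the population average, and users drift toward the model's output; the interaction-strength parameter `trust` and the noise scale are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

n_agents = 1000          # simulated users
n_generations = 20       # model retraining / release cycles
trust = 0.3              # assumed strength of drift toward model output

# Each agent holds a scalar "belief"; the initial population is diverse.
beliefs = rng.normal(loc=0.0, scale=1.0, size=n_agents)

for gen in range(n_generations):
    # "Training": the model absorbs the current population of beliefs.
    model_belief = beliefs.mean()

    # "Deployment": users interact with the model and move toward its output,
    # plus small independent noise standing in for outside influences.
    beliefs = (1 - trust) * beliefs + trust * model_belief
    beliefs += rng.normal(scale=0.05, size=n_agents)

    # Diversity proxy: spread of beliefs across the population.
    print(f"generation {gen:2d}  diversity = {beliefs.std():.3f}")
```

Under these toy assumptions, belief variance decays toward the noise floor within a few retraining cycles, which is the qualitative collapse the abstract describes.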
Lay Summary: AI language models, like ChatGPT, learn from the vast amounts of text humans create. As these AIs generate new content, they can echo and reinforce our existing beliefs, creating a feedback loop where the AI learns our views, feeds them back to us, and then re-absorbs these potentially amplified views. We hypothesized that this cycle might inadvertently trap society in its current ways of thinking, reducing the diversity of our ideas and potentially leading to the widespread entrenchment of false or harmful beliefs – a phenomenon we term "value lock-in."
To investigate this, we developed a mathematical model and ran computer simulations that demonstrated how this belief entrenchment could occur through such human-AI feedback. We then tested our "Lock-in Hypothesis" by analyzing real-world data from millions of interactions with ChatGPT. Our findings revealed noticeable and sustained drops in the variety of concepts discussed by users immediately following the release of new AI model versions, which are trained on updated human data. This suggests that ongoing human-AI interaction could be subtly making our collective thinking more uniform over time. If this trend continues, it risks hindering societal innovation, critical thinking, and our ability to correct widely held misbeliefs.
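
One way to make "variety of concepts" concrete (the paper's actual diversity metric may differ) is the Shannon entropy of the topic distribution across user messages; in the sketch below, `assign_topic` is a hypothetical labeler standing in for whatever concept-extraction step is used.

```python
from collections import Counter
from math import log2

def concept_diversity(topic_labels):
    """Shannon entropy (in bits) of the empirical topic distribution.

    Higher entropy means messages spread evenly over more concepts; a sudden,
    sustained drop would be consistent with the lock-in hypothesis.
    """
    counts = Counter(topic_labels)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())

# Hypothetical usage, assuming messages are grouped by week and labeled:
# diversity_by_week = {
#     week: concept_diversity([assign_topic(m) for m in msgs])
#     for week, msgs in messages_by_week.items()
# }
```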
Primary Area: Social Aspects->Alignment
Keywords: Value lock-in, Human-AI interaction, AI Alignment
Submission Number: 13294