Inertia and fear of lagging behind drive unsafe technological development in an idealised AI Race experiment

Published: 31 Oct 2025, Last Modified: 31 Oct 2025 · BNAIC/BeNeLearn 2025 Poster · CC BY 4.0
Track: Type E (Late-Breaking Abstracts)
Keywords: AI Race, Behavioural Experiments, Evolutionary Game Theory
Abstract: The success and popularisation of Large Language Models (LLMs) have shown how readily humans adopt disruptive technologies to assist in decision-making for daily tasks. For example, Potter et al. \cite{potter2024llms} cite several studies demonstrating that LLMs with radical political leanings can, at least in the short term, influence their users' views. This highlights how involving technologies with inherently opaque (black-box) designs in crucial social and, in this case, democratic decisions may significantly shape the future configuration of our societies. The safety measures taken in the development of such technologies are essential to mitigate the risks of unexpected and potentially catastrophic effects (e.g., increased misinformation, bias, and inequality). However, excessive regulation of technological development can also be detrimental to beneficial innovation \cite{han2019regulate}. Therefore, investigating the incentive mechanisms and intrinsic motivations behind human decisions that lead to either unsafe or safe technological development in a theoretical technology race has the potential to inform the design of tailored regulations and policies. Here, we present the results of a framed behavioural experiment in which participants take on the role of the head of a company. In a two-player game, they make binary decisions over several rounds to develop their company's technology either safely or unsafely in order to win a race for technological supremacy against another player. Over an indefinite number of rounds (10 on average), participants accumulate both round payoffs and development steps. In a single round, the only Nash equilibrium and the social optimum occur when both players choose the unsafe option. Additionally, unsafe development advances a player by 1.5 steps, while safe development advances them by only 1 step.
However, with each unsafe choice, a player's private risk increases: the final accumulated private risk is the fraction of unsafe choices made throughout the game, up to a maximum risk threshold. When the game ends, the player with the most steps wins a bonus payoff. Furthermore, only the winning player is liable for their private risk, which can result in a final payoff of zero. We manipulate the maximum private risk at three levels: 10\%, 60\%, and 90\%. Moreover, we base this experiment on the evolutionary game-theoretical model introduced in \cite{han2019regulate} and hypothesise that, at the 90\% maximum risk level, most participants will choose to develop their technology safely, while at 60\% and 10\% most participants will opt for the unsafe option. However, we find significant differences in the total frequency of unsafe decisions only between the 10\% level and the other two, with no difference between 60\% and 90\%. Moreover, we find that participants are willing to signal from round 1 that they will develop the technology unsafely to win the race, regardless of the manipulated risk. We also find that inertia plays a major role in participants' decisions, as does the interaction between the distance in the race and whether the opponent played unsafe in the previous round (see Table~\ref{tab:complex_model2}). Finally, in contrast to the model of \cite{han2019regulate}, participants often choose conditional strategies that begin with unsafe decisions rather than safe ones. Updating the model with this finding yields results that align more closely with our experimental outcomes. Overall, we find that participants are willing to take high risks to win the race, underscoring the need for external regulation to prevent catastrophic events associated with unsafe technological development.
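The race mechanics described above (unsafe advances 1.5 steps and accrues risk, safe advances 1 step, and only the winner is liable for an accumulated risk equal to the fraction of unsafe choices, capped at the manipulated threshold) can be sketched as a simulation. This is a minimal illustration, not the experimental software: the bonus value, the tie handling, and the interpretation of risk as the probability that the winner's bonus is voided are assumptions for the sketch.

```python
import random

def play_game(choices_a, choices_b, risk_cap, bonus=100.0, rng=random):
    """Sketch of one two-player race.

    choices_a / choices_b: per-round choices, "safe" or "unsafe".
    risk_cap: maximum private risk (0.1, 0.6, or 0.9 in the experiment).
    bonus: winner's bonus payoff (illustrative value, an assumption).
    Returns (winner, winner_bonus); ties yield (None, 0.0) by assumption.
    """
    steps = {"A": 0.0, "B": 0.0}
    unsafe = {"A": 0, "B": 0}
    rounds = len(choices_a)
    for ca, cb in zip(choices_a, choices_b):
        for player, choice in (("A", ca), ("B", cb)):
            if choice == "unsafe":
                steps[player] += 1.5   # unsafe: 1.5 development steps
                unsafe[player] += 1    # ...and one more risky choice
            else:
                steps[player] += 1.0   # safe: 1 development step
    if steps["A"] == steps["B"]:
        return None, 0.0
    winner = "A" if steps["A"] > steps["B"] else "B"
    # Only the winner is liable: their risk is the fraction of unsafe
    # choices, capped at risk_cap; here it voids the bonus with that
    # probability (an interpretive assumption of this sketch).
    risk = min(unsafe[winner] / rounds, risk_cap)
    winner_bonus = 0.0 if rng.random() < risk else bonus
    return winner, winner_bonus
```

Under this sketch, an always-unsafe player beats an always-safe opponent on steps, but at the 90\% cap their bonus is almost surely lost, which is the tension the experiment manipulates.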
Serve As Reviewer: ~Elias_Fernández_Domingos2
Submission Number: 73